awswrangler.neptune.bulk_load

awswrangler.neptune.bulk_load(client: NeptuneClient, df: DataFrame, path: str, iam_role: str, neptune_load_wait_polling_delay: float = 0.25, load_parallelism: Literal['LOW', 'MEDIUM', 'HIGH', 'OVERSUBSCRIBE'] = 'HIGH', parser_configuration: BulkLoadParserConfiguration | None = None, update_single_cardinality_properties: Literal['TRUE', 'FALSE'] = 'FALSE', queue_request: Literal['TRUE', 'FALSE'] = 'FALSE', dependencies: list[str] | None = None, keep_files: bool = False, use_threads: bool | int = True, boto3_session: Session | None = None, s3_additional_kwargs: dict[str, str] | None = None) → None

Write records into Amazon Neptune using the Neptune Bulk Loader.

The DataFrame is first written to S3 and then loaded into Neptune by the Bulk Loader.

Parameters:

client (NeptuneClient) – Instance of the Neptune client to use.
df (DataFrame) – Pandas DataFrame to write to Neptune.
path (str) – S3 path that the Neptune Bulk Loader will load data from.
iam_role (str) – The Amazon Resource Name (ARN) for an IAM role to be assumed by the Neptune DB instance for access to the S3 bucket. For information about creating a role that has access to Amazon S3 and then associating it with a Neptune cluster, see Prerequisites: IAM Role and Amazon S3 Access.
neptune_load_wait_polling_delay (float) – Interval, in seconds, between checks of whether the Neptune bulk load has completed.
load_parallelism (Literal['LOW', 'MEDIUM', 'HIGH', 'OVERSUBSCRIBE']) – Specifies the number of threads used by Neptune’s bulk load process.
parser_configuration (BulkLoadParserConfiguration | None) – An optional object with additional parser configuration values. Each of the child parameters is also optional: namedGraphUri, baseUri and allowEmptyStrings.
update_single_cardinality_properties (Literal['TRUE', 'FALSE']) – An optional parameter that controls how the bulk loader treats a new value for single-cardinality vertex or edge properties.
queue_request (Literal['TRUE', 'FALSE']) – An optional flag that indicates whether the load request can be queued. If omitted or set to "FALSE", the load request fails if another load job is already running.
dependencies (list[str] | None) – An optional parameter that can make a queued load request contingent on the successful completion of one or more previous jobs in the queue.
keep_files (bool) – Whether to keep the stage files in S3 instead of deleting them. False by default.
use_threads (bool | int) – True to enable concurrent requests, False to disable multiple threads. If enabled, os.cpu_count() is used as the maximum number of threads. If an integer is provided, that number of threads is used.
boto3_session (Session | None) – The default boto3 session will be used if boto3_session is None.
s3_additional_kwargs (dict[str, str] | None) – Forwarded to botocore requests, e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
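
The use_threads rules described above can be illustrated with a small standalone sketch. The helper name resolve_max_threads is hypothetical and is not part of awswrangler. Clamping integers below 1 to a single thread is an assumption made for this sketch, not documented behaviour.

```python
from __future__ import annotations

import os


def resolve_max_threads(use_threads: bool | int) -> int:
    """Hypothetical helper: map a use_threads value to a maximum thread count."""
    # bool is a subclass of int, so it has to be checked first.
    if isinstance(use_threads, bool):
        # True -> os.cpu_count() threads; False -> a single thread.
        # os.cpu_count() may return None, in which case fall back to 1.
        return (os.cpu_count() or 1) if use_threads else 1
    # An integer is taken as the thread count; values below 1 are clamped
    # to 1 (an assumption of this sketch).
    return max(1, use_threads)
```

Under these assumptions, use_threads=False and use_threads=1 both lead to a single thread, while use_threads=True scales with the machine running the call.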

Return type:

None

Examples

>>> import awswrangler as wr
>>> import pandas as pd
>>> client = wr.neptune.connect("MY_NEPTUNE_ENDPOINT", 8182)
>>> frame = pd.DataFrame([{"~id": "0", "~labels": ["version"], "~properties": {"type": "version"}}])
>>> wr.neptune.bulk_load(
...     client=client,
...     df=frame,
...     path="s3://my-bucket/stage-files/",
...     iam_role="arn:aws:iam::XXX:role/XXX"
... )
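
A further sketch shows how the queueing parameters might be combined. The helper queued_load_kwargs is hypothetical, the 5-second polling delay is an arbitrary choice, and "PREVIOUS_LOAD_ID" is a placeholder. Treating the entries of dependencies as load IDs of earlier queued jobs is an assumption based on the Neptune loader API.

```python
from __future__ import annotations

from typing import Any


def queued_load_kwargs(
    path: str,
    iam_role: str,
    dependencies: list[str] | None = None,
) -> dict[str, Any]:
    """Hypothetical helper: keyword arguments for a queued bulk_load call."""
    kwargs: dict[str, Any] = {
        "path": path,
        "iam_role": iam_role,
        # queue_request takes the string literals "TRUE"/"FALSE", not a bool.
        "queue_request": "TRUE",
        # Poll less often than the 0.25 s default (arbitrary value).
        "neptune_load_wait_polling_delay": 5.0,
    }
    if dependencies:
        # Assumed to be load IDs of previously queued jobs.
        kwargs["dependencies"] = list(dependencies)
    return kwargs


kwargs = queued_load_kwargs(
    path="s3://my-bucket/stage-files/",
    iam_role="arn:aws:iam::XXX:role/XXX",
    dependencies=["PREVIOUS_LOAD_ID"],
)
```

The result could then be passed on as wr.neptune.bulk_load(client=client, df=frame, **kwargs), reusing client and frame from the example above.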