awswrangler.timestream.batch_load¶
- awswrangler.timestream.batch_load(df: DataFrame, path: str, database: str, table: str, time_col: str, dimensions_cols: list[str], measure_cols: list[str], measure_name_col: str, report_s3_configuration: TimestreamBatchLoadReportS3Configuration, time_unit: Literal['MILLISECONDS', 'SECONDS', 'MICROSECONDS', 'NANOSECONDS'] = 'MILLISECONDS', record_version: int = 1, timestream_batch_load_wait_polling_delay: float = 2, keep_files: bool = False, use_threads: bool | int = True, boto3_session: Session | None = None, s3_additional_kwargs: dict[str, str] | None = None) dict[str, Any] ¶
Batch load a Pandas DataFrame into an Amazon Timestream table.
Note
The supplied column names (time, dimension, measure) MUST match those in the Timestream table.
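For instance, a DataFrame whose columns line up with the batch_load arguments might look like the sketch below. The column names (time, region, location, memory_utilization, cpu_utilization, measure_name) are illustrative only — they must match whatever your Timestream table actually uses.

```python
import pandas as pd

# Illustrative DataFrame for batch_load; column names are assumptions,
# not requirements of the API.
df = pd.DataFrame(
    {
        # time_col: long values representing time since the Unix epoch
        # (MILLISECONDS by default, per the time_unit argument)
        "time": pd.to_datetime(
            ["2023-01-01 00:00:00", "2023-01-01 00:01:00"]
        ).astype("int64") // 10**6,
        # dimensions_cols
        "region": ["us-east-1", "us-east-1"],
        "location": ["az1", "az2"],
        # measure_cols
        "memory_utilization": [42.1, 39.8],
        "cpu_utilization": [13.5, 18.2],
        # measure_name_col
        "measure_name": ["metrics", "metrics"],
    }
)
```

Note the conversion of the time column to integer epoch milliseconds, since Timestream batch load expects a long, not a datetime.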
Note
Only MultiMeasureMappings is supported. See https://docs.aws.amazon.com/timestream/latest/developerguide/batch-load-data-model-mappings.html
Note
The following arguments are not supported in distributed mode with engine EngineEnum.RAY:
boto3_session
s3_additional_kwargs
Note
This function has arguments which can be configured globally through wr.config or environment variables:
database
timestream_batch_load_wait_polling_delay
Check out the Global Configurations Tutorial for details.
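As a sketch, both arguments can also be supplied through environment variables. The WR_-prefixed names below follow the convention described in the Global Configurations Tutorial; treat the exact variable names as assumptions and verify them against the tutorial.

```python
import os

# Assumed WR_ environment-variable names for the two configurable
# arguments listed above (see the Global Configurations Tutorial).
os.environ["WR_DATABASE"] = "sample_db"
os.environ["WR_TIMESTREAM_BATCH_LOAD_WAIT_POLLING_DELAY"] = "2"
# With these set, a batch_load(...) call can omit database= and will
# use the configured polling delay.
```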
- Parameters:
df (DataFrame) – Pandas DataFrame.
path (str) – S3 prefix to write the data.
database (str) – Amazon Timestream database name.
table (str) – Amazon Timestream table name.
time_col (str) – Column name with the time data. It must be a long data type that represents the time since the Unix epoch.
dimensions_cols (list[str]) – List of column names with the dimensions data.
measure_cols (list[str]) – List of column names with the measure data.
measure_name_col (str) – Column name with the measure name.
report_s3_configuration (TimestreamBatchLoadReportS3Configuration) – Dictionary of the configuration for the S3 bucket where the error report is stored. https://docs.aws.amazon.com/timestream/latest/developerguide/API_ReportS3Configuration.html Example: {'BucketName': 'error-report-bucket-name'}
time_unit (Literal['MILLISECONDS', 'SECONDS', 'MICROSECONDS', 'NANOSECONDS']) – Time unit for the time column. MILLISECONDS by default.
record_version (int) – Record version.
timestream_batch_load_wait_polling_delay (float) – Time to wait between two polling attempts.
keep_files (bool) – Whether to keep the files after the operation.
use_threads (bool | int) – True to enable concurrent requests, False to disable multiple threads.
boto3_session (Session | None) – The default boto3 session is used if boto3_session is None.
s3_additional_kwargs (dict[str, str] | None) – Forwarded to S3 botocore requests.
- Return type:
dict[str, Any]
- Returns:
A dictionary of the batch load task response.
Examples
>>> import awswrangler as wr
>>> response = wr.timestream.batch_load(
...     df=df,
...     path='s3://bucket/path/',
...     database='sample_db',
...     table='sample_table',
...     time_col='time',
...     dimensions_cols=['region', 'location'],
...     measure_cols=['memory_utilization', 'cpu_utilization'],
...     measure_name_col='measure_name',
...     report_s3_configuration={'BucketName': 'error-report-bucket-name'},
... )