awswrangler.timestream.batch_load¶
- awswrangler.timestream.batch_load(df: DataFrame, path: str, database: str, table: str, time_col: str, dimensions_cols: list[str], measure_cols: list[str], measure_name_col: str, report_s3_configuration: TimestreamBatchLoadReportS3Configuration, time_unit: Literal['MILLISECONDS', 'SECONDS', 'MICROSECONDS', 'NANOSECONDS'] = 'MILLISECONDS', record_version: int = 1, timestream_batch_load_wait_polling_delay: float = 2, keep_files: bool = False, use_threads: bool | int = True, boto3_session: Session | None = None, s3_additional_kwargs: dict[str, str] | None = None) dict[str, Any] ¶
Batch load a Pandas DataFrame into an Amazon Timestream table.
Note
The supplied column names (time, dimension, measure) MUST match those in the Timestream table.
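For instance, a DataFrame whose columns line up with the batch_load arguments might look like the sketch below. The column names (time, region, location, memory_utilization, cpu_utilization, measure_name) are illustrative only — they must match whatever your Timestream table actually uses.

```python
import pandas as pd

# Illustrative DataFrame for batch_load; column names are assumptions,
# not requirements of the API.
df = pd.DataFrame(
    {
        # time_col: long values representing time since the Unix epoch
        # (MILLISECONDS by default, per the time_unit argument)
        "time": pd.to_datetime(
            ["2023-01-01 00:00:00", "2023-01-01 00:01:00"]
        ).astype("int64") // 10**6,
        # dimensions_cols
        "region": ["us-east-1", "us-east-1"],
        "location": ["az1", "az2"],
        # measure_cols
        "memory_utilization": [42.1, 39.8],
        "cpu_utilization": [13.5, 18.2],
        # measure_name_col
        "measure_name": ["metrics", "metrics"],
    }
)
```

Note the conversion of the time column to integer epoch milliseconds, since Timestream batch load expects a long, not a datetime.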
Note
Only MultiMeasureMappings is supported. See https://docs.aws.amazon.com/timestream/latest/developerguide/batch-load-data-model-mappings.html
Note
The following arguments are not supported in distributed mode with engine EngineEnum.RAY:
boto3_session
s3_additional_kwargs
Note
This function has arguments which can be configured globally through wr.config or environment variables:
database
timestream_batch_load_wait_polling_delay
Check out the Global Configurations Tutorial for details.
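As a sketch, both arguments can also be supplied through environment variables. The WR_-prefixed names below follow the convention described in the Global Configurations Tutorial; treat the exact variable names as assumptions and verify them against the tutorial.

```python
import os

# Assumed WR_ environment-variable names for the two configurable
# arguments listed above (see the Global Configurations Tutorial).
os.environ["WR_DATABASE"] = "sample_db"
os.environ["WR_TIMESTREAM_BATCH_LOAD_WAIT_POLLING_DELAY"] = "2"
# With these set, a batch_load(...) call can omit database= and will
# use the configured polling delay.
```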
- Parameters:
df (DataFrame) – Pandas DataFrame.
path (str) – S3 prefix to write the data.
database (str) – Amazon Timestream database name.
table (str) – Amazon Timestream table name.
time_col (str) – Column name with the time data. It must be a long data type that represents the time since the Unix epoch.
dimensions_cols (list[str]) – List of column names with the dimensions data.
measure_cols (list[str]) – List of column names with the measure data.
measure_name_col (str) – Column name with the measure name.
report_s3_configuration (TimestreamBatchLoadReportS3Configuration) – Dictionary of the configuration for the S3 bucket where the error report is stored. https://docs.aws.amazon.com/timestream/latest/developerguide/API_ReportS3Configuration.html Example: {'BucketName': 'error-report-bucket-name'}
time_unit (Literal['MILLISECONDS', 'SECONDS', 'MICROSECONDS', 'NANOSECONDS']) – Time unit for the time column. MILLISECONDS by default.
record_version (int) – Record version.
timestream_batch_load_wait_polling_delay (float) – Time to wait between two polling attempts.
keep_files (bool) – Whether to keep the files after the operation.
use_threads (bool | int) – True to enable concurrent requests, False to disable multiple threads.
boto3_session (Session | None) – The default boto3 session is used if boto3_session is None.
s3_additional_kwargs (dict[str, str] | None) – Forwarded to S3 botocore requests.
- Return type:
dict[str, Any]
- Returns:
A dictionary of the batch load task response.
Examples
>>> import awswrangler as wr
>>> response = wr.timestream.batch_load(
...     df=df,
...     path='s3://bucket/path/',
...     database='sample_db',
...     table='sample_table',
...     time_col='time',
...     dimensions_cols=['region', 'location'],
...     measure_cols=['memory_utilization', 'cpu_utilization'],
...     measure_name_col='measure_name',
...     report_s3_configuration={'BucketName': 'error-report-bucket-name'},
... )