awswrangler.s3.to_deltalake

awswrangler.s3.to_deltalake(df: DataFrame, path: str, index: bool = False, mode: Literal['error', 'append', 'overwrite', 'ignore'] = 'append', dtype: Dict[str, str] | None = None, partition_cols: List[str] | None = None, overwrite_schema: bool = False, boto3_session: Session | None = None, s3_additional_kwargs: Dict[str, str] | None = None, s3_allow_unsafe_rename: bool = False) → None

Write a DataFrame to S3 as a DeltaLake table.

This function requires the deltalake package.

Warning

This API is experimental and may change in future AWS SDK for Pandas releases.

Parameters:
  • df (pandas.DataFrame) – Pandas DataFrame

  • path (str) – S3 path for a directory where the DeltaLake table will be stored.

  • index (bool) – True to store the DataFrame index in file, otherwise False to ignore it.

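  • mode (str, optional) – How to handle an existing table: append (default) adds the new data, overwrite replaces the table contents, ignore writes nothing if the table already exists, and error raises if the table already exists.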

  • dtype (dict[str, str], optional) – Dictionary of column names and the Athena/Glue types to cast them to. Useful when you have columns with undetermined or mixed data types. (e.g. {'col name': 'bigint', 'col2 name': 'int'})

  • partition_cols (list[str], optional) – List of columns to partition the table by. Only required when creating a new table.

  • overwrite_schema (bool) – If True, allows updating the schema of the table.

  • boto3_session (Optional[boto3.Session()]) – Boto3 Session. If None, the default boto3 session is used.

  • s3_additional_kwargs (Optional[Dict[str, str]]) – Forwarded to the Delta Table class for the storage options of the S3 backend.

  • s3_allow_unsafe_rename (bool) – Allows using the default S3 backend without support for concurrent writers. Concurrent writing is currently not supported, so this option must be turned on explicitly.

Examples

Writing a Pandas DataFrame into a DeltaLake table in S3.

>>> import awswrangler as wr
>>> import pandas as pd
>>> wr.s3.to_deltalake(
...     df=pd.DataFrame({'col': [1, 2, 3]}),
...     path='s3://bucket/prefix/',
...     s3_allow_unsafe_rename=True,
... )
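
Overwriting a partitioned DeltaLake table in S3. This is a minimal sketch, not part of the library: delta_write_options and write_partitioned are hypothetical helper names, and the bucket path, the region column, and the sample values are placeholders. The helper only assembles keyword arguments that appear in the signature above; pandas and awswrangler are imported inside the writer function, so AWS credentials are needed only when it is actually called.

```python
from typing import Any, Dict, List, Optional

# The four write modes listed in the signature of wr.s3.to_deltalake.
VALID_MODES = ("error", "append", "overwrite", "ignore")


def delta_write_options(
    path: str,
    mode: str = "append",
    partition_cols: Optional[List[str]] = None,
    dtype: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for wr.s3.to_deltalake."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return {
        "path": path,
        "mode": mode,
        "partition_cols": partition_cols,
        "dtype": dtype,
        # Concurrent writers are not supported, so this must be enabled explicitly.
        "s3_allow_unsafe_rename": True,
    }


def write_partitioned(path: str) -> None:
    """Hypothetical example writer: replace the table, partitioned by 'region'."""
    # Imported lazily so that defining this function has no AWS side effects.
    import awswrangler as wr
    import pandas as pd

    df = pd.DataFrame({"region": ["eu", "us", "us"], "amount": [1, 2, 3]})
    options = delta_write_options(
        path,
        mode="overwrite",
        partition_cols=["region"],
        dtype={"amount": "bigint"},
    )
    wr.s3.to_deltalake(df=df, **options)
```

Calling write_partitioned('s3://bucket/prefix/') would then issue a single to_deltalake call with mode='overwrite'. Validating the mode before the call is a design choice of this sketch: it surfaces a typo locally instead of after a session to S3 has been set up.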

See also

deltalake.DeltaTable

Create a DeltaTable instance with the deltalake library.

deltalake.write_deltalake

Write to a DeltaLake table.