awswrangler.s3.to_iceberg
- awswrangler.s3.to_iceberg(df: DataFrame, table_bucket_arn: str, namespace: str, table_name: str, mode: Literal['append', 'overwrite'] = 'append', index: bool = False, dtype: dict[str, str] | None = None, boto3_session: Session | None = None) → None
Write a Pandas DataFrame to an S3 Table via PyIceberg.
If the table does not exist, it is automatically created with a schema inferred from the DataFrame.
This function requires the pyiceberg package. Install it with pip install awswrangler[pyiceberg].
By default, the S3 Tables REST endpoint is used. To use the AWS Glue Iceberg REST endpoint instead, set wr.config.s3tables_catalog_endpoint_url (e.g. "https://glue.<region>.amazonaws.com/iceberg"). See Integrating S3 Tables with AWS analytics services for the required Glue Data Catalog and Lake Formation setup.
- Parameters:
df (pd.DataFrame) – Pandas DataFrame to write.
table_bucket_arn (str) – The ARN of the S3 table bucket.
namespace (str) – The namespace of the table.
table_name (str) – The name of the table to write to.
mode (str, optional) – Write mode. "append" (default) adds rows to the table. "overwrite" replaces all existing data.
index (bool, optional) – If True, include the DataFrame index as a column. Default is False.
dtype (dict[str, str], optional) – Dictionary of column names and Athena/Glue types to cast (e.g. {"col_name": "bigint", "col2_name": "int"}).
boto3_session (boto3.Session, optional) – Boto3 Session. If None, the default boto3 session is used.
- Return type:
None
Examples
>>> import awswrangler as wr
>>> import pandas as pd
>>> wr.s3.to_iceberg(
...     df=pd.DataFrame({"col": [1, 2, 3]}),
...     table_bucket_arn="arn:aws:s3tables:us-east-1:123456789012:bucket/my-bucket",
...     namespace="my_namespace",
...     table_name="my_table",
... )
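The example above uses the defaults. As a rough sketch of the other documented options, the following combines the Glue Iceberg REST endpoint with mode="overwrite" and a dtype cast. The helper names glue_iceberg_endpoint and overwrite_via_glue are hypothetical and not part of awswrangler, the namespace and table name are placeholders, and setting the config value by plain attribute assignment is an assumption about how wr.config is used here.

```python
def glue_iceberg_endpoint(region: str) -> str:
    """Build the AWS Glue Iceberg REST endpoint URL for a region (hypothetical helper)."""
    return f"https://glue.{region}.amazonaws.com/iceberg"


def overwrite_via_glue(df, table_bucket_arn: str, region: str) -> None:
    """Replace all rows of a placeholder table through the Glue endpoint (hypothetical helper)."""
    # Imported lazily so the URL helper above works without awswrangler installed.
    import awswrangler as wr

    # Assumption: the config value is set by attribute assignment.
    # This switches from the default S3 Tables REST endpoint to the Glue one.
    wr.config.s3tables_catalog_endpoint_url = glue_iceberg_endpoint(region)

    wr.s3.to_iceberg(
        df=df,
        table_bucket_arn=table_bucket_arn,
        namespace="my_namespace",  # placeholder namespace
        table_name="my_table",     # placeholder table name
        mode="overwrite",          # replaces all existing data
        dtype={"col": "bigint"},   # cast column "col" to the Athena/Glue type bigint
    )
```

Calling overwrite_via_glue(pd.DataFrame({"col": [1, 2, 3]}), "arn:aws:s3tables:us-east-1:123456789012:bucket/my-bucket", "us-east-1") would require valid AWS credentials plus the Glue Data Catalog and Lake Formation setup referenced in the description.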