awswrangler.athena.delete_from_iceberg_table
- awswrangler.athena.delete_from_iceberg_table(df: DataFrame, database: str, table: str, merge_cols: list[str], temp_path: str | None = None, keep_files: bool = True, data_source: str | None = None, s3_output: str | None = None, workgroup: str = 'primary', encryption: str | None = None, kms_key: str | None = None, dtype: dict[str, str] | None = None, boto3_session: Session | None = None, s3_additional_kwargs: dict[str, Any] | None = None, catalog_id: str | None = None) → None
Delete rows from an Iceberg table.
Creates a temporary external table, writes the staged files, and then deletes any rows in the Iceberg table that match the contents of the temporary table.
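To picture which rows are removed, the matching step can be mimicked locally. The sketch below is only an illustration, not the library's implementation: it assumes a row is deleted when its values in every column of merge_cols equal those of some row in df, and it does not model NULL handling or Athena type casting. The helper name emulate_delete is hypothetical.

```python
def emulate_delete(table_rows, delete_rows, merge_cols):
    """Return the rows of table_rows that would survive the delete.

    Assumption: a table row is deleted when its values in all merge_cols
    equal those of at least one row in delete_rows.
    """
    # Collect the key tuples present in the "delete" frame.
    keys = {tuple(row[c] for c in merge_cols) for row in delete_rows}
    # Keep only rows whose key tuple is not in that set.
    return [
        row for row in table_rows
        if tuple(row[c] for c in merge_cols) not in keys
    ]


table_rows = [
    {"id": 1, "col": "foo"},
    {"id": 2, "col": "bar"},
    {"id": 3, "col": "baz"},
]
delete_rows = [{"id": 1}, {"id": 3}]

remaining = emulate_delete(table_rows, delete_rows, ["id"])
print(remaining)  # [{'id': 2, 'col': 'bar'}]
```

Only the columns named in merge_cols take part in the comparison, so the DataFrame passed as df needs nothing beyond those key columns.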
- Parameters:
df (DataFrame) – Pandas DataFrame containing the IDs of rows that are to be deleted from the Iceberg table.
database (str) – Database name.
table (str) – Table name.
merge_cols (list[str]) – List of columns used to determine which rows of the Iceberg table should be deleted.
temp_path (str | None) – S3 path to temporarily store the DataFrame.
keep_files (bool) – Whether staging files produced by Athena are retained. True by default.
data_source (str | None) – Data source / catalog name. If None, 'AwsDataCatalog' is used by default.
s3_output (str | None) – Amazon S3 path used for query execution.
workgroup (str) – Athena workgroup name.
encryption (str | None) – Valid values: [None, "SSE_S3", "SSE_KMS"]. Notice: "CSE_KMS" is not supported.
kms_key (str | None) – For SSE-KMS, this is the KMS key ARN or ID.
dtype (dict[str, str] | None) – Dictionary of column names and the Athena/Glue types to cast them to. Useful when you have columns with undetermined or mixed data types (e.g. {'col name': 'bigint', 'col2 name': 'int'}).
boto3_session (Session | None) – The default boto3 session is used if boto3_session is None.
s3_additional_kwargs (dict[str, Any] | None) – Forwarded to botocore requests, e.g. `s3_additional_kwargs={"RequestPayer": "requester"}`.
catalog_id (str | None) – The ID of the Data Catalog which contains the database and table. If none is provided, the AWS account ID is used by default.
- Return type:
None
Examples
>>> import awswrangler as wr
>>> import pandas as pd
>>> df = pd.DataFrame({"id": [1, 2, 3], "col": ["foo", "bar", "baz"]})
>>> wr.athena.to_iceberg(
...     df=df,
...     database="my_database",
...     table="my_table",
...     temp_path="s3://bucket/temp/",
... )
>>> df_delete = pd.DataFrame({"id": [1, 3]})
>>> wr.athena.delete_from_iceberg_table(
...     df=df_delete,
...     database="my_database",
...     table="my_table",
...     merge_cols=["id"],
... )
>>> wr.athena.read_sql_table(table="my_table", database="my_database")
   id  col
0   2  bar
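When a single column does not identify a row, merge_cols can list several columns. The following is a minimal sketch under stated assumptions: the wrapper name delete_by_composite_key and the column names customer_id and order_date are hypothetical, and the sketch assumes that every column listed in merge_cols is present in the DataFrame. Only parameters from the signature above are used, and awswrangler is imported lazily so that defining the function needs neither the package nor AWS credentials.

```python
def delete_by_composite_key(df_keys, database, table, temp_path):
    """Delete rows whose (customer_id, order_date) pair appears in df_keys.

    The column names are hypothetical; replace them with the key columns
    of your own table.
    """
    # Imported here so the function can be defined without AWS access.
    import awswrangler as wr

    wr.athena.delete_from_iceberg_table(
        df=df_keys,
        database=database,
        table=table,
        merge_cols=["customer_id", "order_date"],  # hypothetical key columns
        temp_path=temp_path,  # S3 path for the staged DataFrame
        keep_files=False,     # do not retain the staging files
    )
```

Calling the function requires a DataFrame that holds both key columns, for example `pd.DataFrame({"customer_id": [7], "order_date": ["2024-01-31"]})`, together with a temp_path that the active session is allowed to write to.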