awswrangler.postgresql.to_sql

awswrangler.postgresql.to_sql(df: DataFrame, con: pg8000.Connection, table: str, schema: str, mode: Literal['append', 'overwrite', 'upsert'] = 'append', overwrite_method: Literal['drop', 'cascade', 'truncate', 'truncate cascade'] = 'drop', index: bool = False, dtype: dict[str, str] | None = None, varchar_lengths: dict[str, int] | None = None, use_column_names: bool = False, chunksize: int = 200, upsert_conflict_columns: list[str] | None = None, insert_conflict_columns: list[str] | None = None, commit_transaction: bool = True) None

Write records stored in a DataFrame into PostgreSQL.

Note

This function has arguments which can be configured globally through wr.config or environment variables:

  • chunksize

Check out the Global Configurations Tutorial for details.

Parameters:
  • df (DataFrame) – Pandas DataFrame

  • con (Connection) – Use pg8000.connect() to use credentials directly or wr.postgresql.connect() to fetch it from the Glue Catalog.

  • table (str) – Table name

  • schema (str) – Schema name

  • mode (Literal['append', 'overwrite', 'upsert']) –

    Append, overwrite or upsert.

    • append: Inserts new records into table.

    • overwrite: Drops table and recreates.

    • upsert: Perform an upsert which checks for conflicts on columns given by upsert_conflict_columns and sets the new values on conflicts. Note that upsert_conflict_columns is required for this mode.

  • overwrite_method (Literal['drop', 'cascade', 'truncate', 'truncate cascade']) –

    Drop, cascade, truncate, or truncate cascade. Only applicable in overwrite mode.

    • ”drop” - DROP ... RESTRICT - drops the table. Fails if there are any views that depend on it.

    • ”cascade” - DROP ... CASCADE - drops the table, and all views that depend on it.

    • ”truncate” - TRUNCATE ... RESTRICT - truncates the table. Fails if any of the tables have foreign-key references from tables that are not listed in the command.

    • ”truncate cascade” - TRUNCATE ... CASCADE - truncates the table, and all tables that have foreign-key references to any of the named tables.

  • index (bool) – True to store the DataFrame index as a column in the table, otherwise False to ignore it.

  • dtype (dict[str, str] | None) – Dictionary of columns names and PostgreSQL types to be casted. Useful when you have columns with undetermined or mixed data types. (e.g. {'col name': 'TEXT', 'col2 name': 'FLOAT'})

  • varchar_lengths (dict[str, int] | None) – Dict of VARCHAR length by columns. (e.g. {"col1": 10, "col5": 200}).

  • use_column_names (bool) –

    If set to True, will use the column names of the DataFrame for generating the INSERT SQL Query.

    E.g. If the DataFrame has two columns col1 and col3 and use_column_names is True, data will only be inserted into the database columns col1 and col3.

  • chunksize (int) – Number of rows which are inserted with each SQL query. Defaults to inserting 200 rows per query.

  • upsert_conflict_columns (list[str] | None) – This parameter is only supported if mode is set top upsert. In this case conflicts for the given columns are checked for evaluating the upsert.

  • insert_conflict_columns (list[str] | None) – This parameter is only supported if mode is set top append. In this case conflicts for the given columns are checked for evaluating the insert ‘ON CONFLICT DO NOTHING’.

  • commit_transaction (bool) – Whether to commit the transaction. True by default.

Return type:

None

Examples

Writing to PostgreSQL using a Glue Catalog Connections

>>> import awswrangler as wr
>>> with wr.postgresql.connect("MY_GLUE_CONNECTION") as con:
...     wr.postgresql.to_sql(
...         df=df,
...         table="my_table",
...         schema="public",
...         con=con
...     )