awswrangler.postgresql.to_sql¶
- awswrangler.postgresql.to_sql(df: DataFrame, con: Connection, table: str, schema: str, mode: str = 'append', index: bool = False, dtype: Dict[str, str] | None = None, varchar_lengths: Dict[str, int] | None = None, use_column_names: bool = False, chunksize: int = 200, upsert_conflict_columns: List[str] | None = None, insert_conflict_columns: List[str] | None = None) Any ¶
Write records stored in a DataFrame into PostgreSQL.
Note
This function has arguments which can be configured globally through wr.config or environment variables:
chunksize
Check out the Global Configurations Tutorial for details.
- Parameters:
df (pandas.DataFrame) – Pandas DataFrame https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html
con (pg8000.Connection) – Use pg8000.connect() to use credentials directly or wr.postgresql.connect() to fetch it from the Glue Catalog.
table (str) – Table name
schema (str) – Schema name
mode (str) –
- Append, overwrite or upsert.
append: Inserts new records into table. overwrite: Drops table and recreates. upsert: Perform an upsert which checks for conflicts on columns given by upsert_conflict_columns and sets the new values on conflicts. Note that upsert_conflict_columns is required for this mode.
index (bool) – True to store the DataFrame index as a column in the table, otherwise False to ignore it.
dtype (Dict[str, str], optional) – Dictionary of columns names and PostgreSQL types to be casted. Useful when you have columns with undetermined or mixed data types. (e.g. {‘col name’: ‘TEXT’, ‘col2 name’: ‘FLOAT’})
varchar_lengths (Dict[str, int], optional) – Dict of VARCHAR length by columns. (e.g. {“col1”: 10, “col5”: 200}).
use_column_names (bool) – If set to True, will use the column names of the DataFrame for generating the INSERT SQL Query. E.g. If the DataFrame has two columns col1 and col3 and use_column_names is True, data will only be inserted into the database columns col1 and col3.
chunksize (int) – Number of rows which are inserted with each SQL query. Defaults to inserting 200 rows per query.
upsert_conflict_columns (List[str], optional) – This parameter is only supported if mode is set top upsert. In this case conflicts for the given columns are checked for evaluating the upsert.
insert_conflict_columns (List[str], optional) – This parameter is only supported if mode is set top append. In this case conflicts for the given columns are checked for evaluating the insert ‘ON CONFLICT DO NOTHING’.
- Returns:
None.
- Return type:
None
Examples
Writing to PostgreSQL using a Glue Catalog Connections
>>> import awswrangler as wr >>> con = wr.postgresql.connect("MY_GLUE_CONNECTION") >>> wr.postgresql.to_sql( ... df=df, ... table="my_table", ... schema="public", ... con=con ... ) >>> con.close()