awswrangler.sqlserver.to_sql

awswrangler.sqlserver.to_sql(df: DataFrame, con: pyodbc.Connection, table: str, schema: str, mode: Literal['append', 'overwrite', 'upsert'] = 'append', index: bool = False, dtype: dict[str, str] | None = None, varchar_lengths: dict[str, int] | None = None, use_column_names: bool = False, upsert_conflict_columns: list[str] | None = None, chunksize: int = 200, fast_executemany: bool = False) → None

Write records stored in a DataFrame into Microsoft SQL Server.

Note

This function has arguments that can be configured globally through wr.config or environment variables:

  • chunksize

Check out the Global Configurations Tutorial for details.

Parameters:
  • df (DataFrame) – Pandas DataFrame (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html)

  • con (Connection) – Use pyodbc.connect() to create a connection from credentials directly, or wr.sqlserver.connect() to build one from a Glue Catalog connection.

  • table (str) – Table name

  • schema (str) – Schema name

  • mode (Literal['append', 'overwrite', 'upsert']) –

    Append, overwrite or upsert.

    • append: Inserts new records into the table.

    • overwrite: Drops the table and recreates it.

    • upsert: Performs an upsert, which checks for conflicts on the columns given by upsert_conflict_columns and sets the new values on conflict. Note that the column names of the DataFrame will be used for this operation, as if use_column_names were set to True.

  • index (bool) – True to store the DataFrame index as a column in the table, otherwise False to ignore it.

  • dtype (dict[str, str] | None) – Dictionary of column names and Microsoft SQL Server types to cast them to. Useful when you have columns with undetermined or mixed data types. (e.g. {'col name': 'TEXT', 'col2 name': 'FLOAT'})

  • varchar_lengths (dict[str, int] | None) – Dict of VARCHAR lengths by column. (e.g. {"col1": 10, "col5": 200})

  • use_column_names (bool) – If set to True, the column names of the DataFrame will be used to generate the INSERT SQL query. E.g., if the DataFrame has two columns col1 and col3 and use_column_names is True, data will only be inserted into the database columns col1 and col3.

  • upsert_conflict_columns (list[str] | None) – List of columns to be used as conflict columns in the upsert operation.

  • chunksize (int) – Number of rows inserted with each SQL query. Defaults to inserting 200 rows per query.

  • fast_executemany (bool) –

    An execution mode that greatly reduces round trips for a DBAPI executemany() call when using Microsoft ODBC drivers, for batches of limited size that fit in memory. False by default.

    https://github.com/mkleehammer/pyodbc/wiki/Cursor#executemanysql-params-with-fast_executemanytrue

    Note: when using this mode, pyodbc converts the Python parameter values to their ODBC "C" equivalents based on the target column types in the database, which may lead to subtle data type conversion differences depending on whether fast_executemany is True or False.

Return type:

None

Examples

Writing to Microsoft SQL Server using a Glue Catalog Connection

>>> import awswrangler as wr
>>> with wr.sqlserver.connect(connection="MY_GLUE_CONNECTION", odbc_driver_version=17) as con:
...     wr.sqlserver.to_sql(
...         df=df,
...         table="table",
...         schema="dbo",
...         con=con
...     )
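
Upserting into an existing table in upsert mode, matching rows on a key column. This is a sketch following the example above; the connection name MY_GLUE_CONNECTION and the conflict column "id" are illustrative and assume the target table has a matching key column.

>>> import awswrangler as wr
>>> with wr.sqlserver.connect(connection="MY_GLUE_CONNECTION", odbc_driver_version=17) as con:
...     wr.sqlserver.to_sql(
...         df=df,
...         table="table",
...         schema="dbo",
...         mode="upsert",
...         upsert_conflict_columns=["id"],
...         con=con
...     )

Rows whose "id" already exists in the table are updated with the DataFrame's values; all other rows are inserted.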