awswrangler.mysql.read_sql_table(table: str, con: pymysql.connections.Connection[Any], schema: Optional[str] = None, index_col: Optional[Union[str, List[str]]] = None, params: Optional[Union[List[Any], Tuple[Any, ...], Dict[Any, Any]]] = None, chunksize: Optional[int] = None, dtype: Optional[Dict[str, DataType]] = None, safe: bool = True, timestamp_as_object: bool = False) → Union[DataFrame, Iterator[DataFrame]]

Return a DataFrame corresponding to the table.

Parameters

  • table (str) – Table name.

  • con (pymysql.connections.Connection) – Use pymysql.connect() to pass credentials directly or wr.mysql.connect() to fetch them from the Glue Catalog.

  • schema (str, optional) – Name of SQL schema in database to query. Uses default schema if None.

  • index_col (Union[str, List[str]], optional) – Column(s) to set as the index (a MultiIndex if several columns are given).

  • params (Union[List, Tuple, Dict], optional) – Parameters to pass to the execute method. The syntax used to pass parameters is database driver dependent. Check your database driver documentation for which of the five syntax styles, described in PEP 249’s paramstyle, is supported.

  • chunksize (int, optional) – If specified, return an iterator where chunksize is the number of rows to include in each chunk.

  • dtype (Dict[str, pyarrow.DataType], optional) – Data types to use for columns. The keys should be the column names and the values should be the PyArrow types.

  • safe (bool) – Check for overflows or other unsafe data type conversions.

  • timestamp_as_object (bool) – Cast non-nanosecond timestamps (np.datetime64) to objects.
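
As a rough illustration of how index_col, dtype, and safe combine, the sketch below reads a hypothetical table. The table name orders, the schema test, the column names order_id and amount, and the helper read_orders are all assumptions made for the example; they are not part of the library.

```python
from typing import Any


def read_orders(con: Any) -> Any:
    """Read the hypothetical `test.orders` table, indexed by `order_id`."""
    # Imported lazily so that defining this helper does not require the libraries.
    import awswrangler as wr
    import pyarrow as pa

    return wr.mysql.read_sql_table(
        table="orders",                  # hypothetical table name
        schema="test",                   # hypothetical schema name
        con=con,
        index_col="order_id",            # hypothetical column used as the index
        dtype={"amount": pa.float64()},  # hypothetical column, read as a PyArrow float64
        safe=True,                       # check for overflows or unsafe conversions
    )
```

Given an open connection con, calling read_orders(con) should return a single DataFrame, since chunksize is left unset.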


Returns

Result as Pandas DataFrame(s).

Return type

Union[pandas.DataFrame, Iterator[pandas.DataFrame]]


Examples

Reading from MySQL using a Glue Catalog connection

>>> import awswrangler as wr
>>> con = wr.mysql.connect("MY_GLUE_CONNECTION")
>>> df = wr.mysql.read_sql_table(
...     table="my_table",
...     schema="test",
...     con=con
... )
>>> con.close()
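
When chunksize is set, the call returns an iterator of DataFrames instead of a single DataFrame. The following is a minimal sketch of consuming that iterator one chunk at a time; the helper names count_rows and count_table_rows and the default chunk size are illustrative assumptions, not part of the library.

```python
from typing import Iterable, Sized


def count_rows(chunks: Iterable[Sized]) -> int:
    """Sum the lengths of the chunks, holding only one chunk at a time."""
    total = 0
    for chunk in chunks:
        total += len(chunk)  # len() of a DataFrame is its number of rows
    return total


def count_table_rows(glue_connection: str, table: str, schema: str,
                     chunksize: int = 10_000) -> int:
    """Count the rows of a table by streaming it in chunks (hypothetical helper)."""
    # Imported lazily so count_rows stays usable without awswrangler installed.
    import awswrangler as wr

    con = wr.mysql.connect(glue_connection)
    try:
        # With chunksize set, this is an iterator of DataFrames.
        chunks = wr.mysql.read_sql_table(
            table=table, schema=schema, con=con, chunksize=chunksize
        )
        return count_rows(chunks)
    finally:
        con.close()  # close the connection even if reading fails
```

Consuming the iterator inside the try block matters here: the chunks are fetched lazily, so the connection has to stay open until the last chunk has been read.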