awswrangler.catalog.extract_athena_types(df: DataFrame, index: bool = False, partition_cols: list[str] | None = None, dtype: dict[str, str] | None = None, file_format: str = 'parquet') → tuple[dict[str, str], dict[str, str]]

Extract column and partition types (Amazon Athena) from a pandas DataFrame.

Parameters:

  • df (pandas.DataFrame) – pandas DataFrame whose column types are extracted.

  • index (bool) – Whether to treat the DataFrame index as a column.

  • partition_cols (List[str], optional) – List of partition column names.

  • dtype (Dict[str, str], optional) – Dictionary of column names and the Athena/Glue types they should be cast to. Useful when columns have undetermined or mixed data types (e.g. {'col name': 'bigint', 'col2 name': 'int'}).

  • file_format (str, optional) – File format used to decide where the index column is placed: "parquet" | "csv".


Returns:

columns_types: Dictionary with column names as keys and data types as values (e.g. {'col0': 'bigint', 'col1': 'double'}).

partitions_types: Dictionary with partition names as keys and data types as values (e.g. {'col2': 'date'}).

Return type:

Tuple[Dict[str, str], Dict[str, str]]


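>>> import awswrangler as wr
>>> import pandas as pd
>>> df = pd.DataFrame({"col0": [1, 2], "par0": ["a", "b"], "par1": ["x", "y"]})
>>> columns_types, partitions_types = wr.catalog.extract_athena_types(
...     df=df, index=False, partition_cols=["par0", "par1"], file_format="csv"
... )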
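
To make the shape of the two returned dictionaries concrete, the following is a minimal, standard-library-only sketch of how column types might be split into `columns_types` and `partitions_types`, with `dtype` overrides taking precedence. It is not awswrangler's implementation: `ASSUMED_TYPE_MAP` and `split_athena_types` are hypothetical names, the pandas-to-Athena mapping shown is a simplified assumption, and the input is a plain dict of pandas dtype names (such as one might build with `df.dtypes.astype(str).to_dict()`) rather than a DataFrame.

```python
# Illustrative sketch only: NOT awswrangler's implementation.
# ASSUMED_TYPE_MAP and split_athena_types are hypothetical names.
from __future__ import annotations

# Simplified, assumed mapping from pandas dtype names to Athena types.
ASSUMED_TYPE_MAP = {
    "int32": "int",
    "int64": "bigint",
    "float32": "float",
    "float64": "double",
    "bool": "boolean",
    "object": "string",
    "datetime64[ns]": "timestamp",
}


def split_athena_types(
    pandas_dtypes: dict[str, str],
    partition_cols: list[str] | None = None,
    dtype: dict[str, str] | None = None,
) -> tuple[dict[str, str], dict[str, str]]:
    """Return (columns_types, partitions_types) for the given dtype names."""
    partition_cols = partition_cols or []
    dtype = dtype or {}
    columns_types: dict[str, str] = {}
    partitions_types: dict[str, str] = {}
    for name, pandas_dtype in pandas_dtypes.items():
        # An explicit override in `dtype` wins over the inferred type.
        athena_type = dtype.get(name) or ASSUMED_TYPE_MAP.get(pandas_dtype)
        if athena_type is None:
            raise ValueError(f"No Athena type known for column {name!r}")
        # Partition columns are reported separately from regular columns.
        target = partitions_types if name in partition_cols else columns_types
        target[name] = athena_type
    return columns_types, partitions_types


columns_types, partitions_types = split_athena_types(
    {"col0": "int64", "col1": "float64", "par0": "object"},
    partition_cols=["par0"],
    dtype={"par0": "date"},  # override the inferred "string" type
)
print(columns_types)     # {'col0': 'bigint', 'col1': 'double'}
print(partitions_types)  # {'par0': 'date'}
```

In this sketch the `par0` column would otherwise be inferred as `string`; passing it in `dtype` shows how an override can pin an ambiguous column to a specific Athena type.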