AWS SDK for pandas

24 - Athena Query Metadata

For wr.athena.read_sql_query() and wr.athena.read_sql_table(), the resulting DataFrame (or every DataFrame in the returned Iterator, for chunked queries) has a query_metadata attribute that holds the query result metadata returned by Boto3/Athena.

The query_metadata format is the same as the one returned by:

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html#Athena.Client.get_query_execution
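For chunked queries, the attribute can be read from each DataFrame as the Iterator yields it. The sketch below is only an illustration: metadata_by_execution is a hypothetical helper (not part of awswrangler), the chunks are plain stand-in objects with made-up values so the code runs without an AWS connection, and it assumes that the metadata carries a QueryExecutionId field and that all chunks of one query share the same metadata.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def metadata_by_execution(frames: Iterable[Any]) -> Dict[str, dict]:
    """Collect query_metadata from each frame, keyed by QueryExecutionId.

    Hypothetical helper: chunks that come from the same query are
    collapsed into a single entry, so statistics are not counted twice.
    """
    collected: Dict[str, dict] = {}
    for frame in frames:
        meta = frame.query_metadata
        collected[meta["QueryExecutionId"]] = meta
    return collected


# Stand-ins for two chunks of one query. The ID and the byte count are
# made-up illustrative values, not real Athena output.
shared = {
    "QueryExecutionId": "example-id",
    "Statistics": {"DataScannedInBytes": 1024},
}
chunks = [
    SimpleNamespace(query_metadata=shared),
    SimpleNamespace(query_metadata=shared),
]

unique = metadata_by_execution(chunks)
print(len(unique))
print(unique["example-id"]["Statistics"]["DataScannedInBytes"])
```

With a real connection, the stand-in list would be replaced by the Iterator returned from wr.athena.read_sql_query() when its chunksize argument is set.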

Environment Variables

[1]:
%env WR_DATABASE=default
env: WR_DATABASE=default
[2]:
import awswrangler as wr
[5]:
df = wr.athena.read_sql_query("SELECT 1 AS foo")

df
[5]:
   foo
0    1

Getting statistics from query metadata

[6]:
print(f"DataScannedInBytes:            {df.query_metadata['Statistics']['DataScannedInBytes']}")
print(f"TotalExecutionTimeInMillis:    {df.query_metadata['Statistics']['TotalExecutionTimeInMillis']}")
print(f"QueryQueueTimeInMillis:        {df.query_metadata['Statistics']['QueryQueueTimeInMillis']}")
print(f"QueryPlanningTimeInMillis:     {df.query_metadata['Statistics']['QueryPlanningTimeInMillis']}")
print(f"ServiceProcessingTimeInMillis: {df.query_metadata['Statistics']['ServiceProcessingTimeInMillis']}")
DataScannedInBytes:            0
TotalExecutionTimeInMillis:    2311
QueryQueueTimeInMillis:        121
QueryPlanningTimeInMillis:     250
ServiceProcessingTimeInMillis: 37
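
The five lookups above repeat the same pattern, so they can be folded into a small helper. This is a minimal sketch, not part of awswrangler: summarize_statistics and STAT_KEYS are hypothetical names, the sample dictionary simply reuses the values printed above so the code runs without an AWS connection, and the use of .get() reflects an assumption that some statistics may be missing for certain queries.

```python
from typing import Dict, Optional

# The statistics printed in the cell above.
STAT_KEYS = (
    "DataScannedInBytes",
    "TotalExecutionTimeInMillis",
    "QueryQueueTimeInMillis",
    "QueryPlanningTimeInMillis",
    "ServiceProcessingTimeInMillis",
)


def summarize_statistics(query_metadata: dict) -> Dict[str, Optional[int]]:
    """Return the selected statistics, using None for any missing key."""
    stats = query_metadata.get("Statistics", {})
    return {key: stats.get(key) for key in STAT_KEYS}


# Stand-in for df.query_metadata, reusing the values shown above.
sample_metadata = {
    "Statistics": {
        "DataScannedInBytes": 0,
        "TotalExecutionTimeInMillis": 2311,
        "QueryQueueTimeInMillis": 121,
        "QueryPlanningTimeInMillis": 250,
        "ServiceProcessingTimeInMillis": 37,
    }
}

summary = summarize_statistics(sample_metadata)
for key, value in summary.items():
    # Pad the label so the values line up in one column.
    print(f"{key + ':':<31}{value}")
```

In the notebook itself, summarize_statistics(df.query_metadata) would take the place of the stand-in dictionary.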