AWS SDK for pandas

21 - Global Configurations

awswrangler offers two ways to set global configurations that override the default arguments in function signatures:

  • Environment variables

  • wr.config

P.S. Check the API documentation of a function to see whether any of its arguments can be configured through global configurations.

P.P.S. One exception to the rules above is the ``botocore_config`` property: it cannot be set through environment variables, only via ``wr.config``. It is used as the ``botocore.config.Config`` for all underlying ``boto3`` calls. The default config is ``botocore.config.Config(retries={"max_attempts": 5}, connect_timeout=10, max_pool_connections=10)``. If you only want to change the retry behavior, you can use the environment variables ``AWS_MAX_ATTEMPTS`` and ``AWS_RETRY_MODE`` instead (see the Boto3 documentation).
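As a minimal sketch of the retry-only route mentioned above, the two variables can be set from Python before any AWS client is created. The values shown (10 attempts, ``standard`` mode) are illustrative assumptions, not recommendations from this tutorial.

```python
import os

# Retry settings that boto3/botocore read from the environment.
# The concrete values below are illustrative assumptions.
os.environ["AWS_MAX_ATTEMPTS"] = "10"      # total attempts, including the initial call
os.environ["AWS_RETRY_MODE"] = "standard"  # boto3 documents: legacy, standard, adaptive
```

Environment variables are always strings, so the attempt count is written as ``"10"`` rather than ``10``.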

Environment Variables

[1]:
%env WR_DATABASE=default
%env WR_CTAS_APPROACH=False
%env WR_MAX_CACHE_SECONDS=900
%env WR_MAX_CACHE_QUERY_INSPECTIONS=500
%env WR_MAX_REMOTE_CACHE_ENTRIES=50
%env WR_MAX_LOCAL_CACHE_ENTRIES=100
env: WR_DATABASE=default
env: WR_CTAS_APPROACH=False
env: WR_MAX_CACHE_SECONDS=900
env: WR_MAX_CACHE_QUERY_INSPECTIONS=500
env: WR_MAX_REMOTE_CACHE_ENTRIES=50
env: WR_MAX_LOCAL_CACHE_ENTRIES=100
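The ``%env`` magic only works in IPython/Jupyter. In a plain Python script, the same variables can be set through ``os.environ``. The sketch below follows the naming pattern visible in this tutorial (configuration name, upper-cased, with a ``WR_`` prefix). ``wr_env_name`` is a hypothetical helper, not part of awswrangler, and the sketch assumes the variables are read when awswrangler is imported, so they are set first.

```python
import os


def wr_env_name(config_name: str) -> str:
    """Map a config name such as 'max_cache_seconds' to 'WR_MAX_CACHE_SECONDS'.

    Hypothetical helper that mirrors the naming pattern used in this tutorial.
    """
    return f"WR_{config_name.upper()}"


# A subset of the settings from the %env cell above.
settings = {
    "database": "default",
    "ctas_approach": False,
    "max_cache_seconds": 900,
}

# Environment variables are strings, so every value is converted with str().
for name, value in settings.items():
    os.environ[wr_env_name(name)] = str(value)

# Import awswrangler only after this point
# (assumption: the variables are read at import time).
# import awswrangler as wr
```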
[2]:
import awswrangler as wr
import botocore
[3]:
wr.athena.read_sql_query("SELECT 1 AS FOO")
[3]:
   foo
0    1

Resetting

[4]:
# Reset a single configuration
wr.config.reset("database")
# Reset all configurations
wr.config.reset()
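When a setting should apply only to one block of code, assignment and reset can be paired in a context manager. The following is a minimal sketch: ``temporary_config`` is a hypothetical helper, not part of awswrangler, and it relies only on attribute assignment and ``reset(name)`` as used in this tutorial. Note that ``reset`` does not restore the previous value. What it falls back to (for example, an environment variable that is still set) should be verified against the library.

```python
from contextlib import contextmanager


@contextmanager
def temporary_config(config, **overrides):
    """Apply overrides to a config object, then reset each one on exit.

    Hypothetical helper: `config` is expected to behave like wr.config,
    i.e. support attribute assignment and a reset(name) method.
    """
    try:
        for name, value in overrides.items():
            setattr(config, name, value)
        yield config
    finally:
        # Reset every overridden name, even if the body raised.
        for name in overrides:
            config.reset(name)


# Usage (requires awswrangler and AWS credentials):
# import awswrangler as wr
# with temporary_config(wr.config, database="default"):
#     wr.athena.read_sql_query("SELECT 1 AS FOO")
```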

wr.config

[5]:
wr.config.database = "default"
wr.config.ctas_approach = False
wr.config.max_cache_seconds = 900
wr.config.max_cache_query_inspections = 500
wr.config.max_remote_cache_entries = 50
wr.config.max_local_cache_entries = 100
# Set botocore.config.Config that will be used for all boto3 calls
wr.config.botocore_config = botocore.config.Config(
    retries={"max_attempts": 10},
    connect_timeout=20,
    max_pool_connections=20
)
[6]:
wr.athena.read_sql_query("SELECT 1 AS FOO")
[6]:
   foo
0    1

Visualizing

[7]:
wr.config
[7]:
    name                         Env. Variable                   type                              nullable  enforced  configured  value
 0  catalog_id                   WR_CATALOG_ID                   <class 'str'>                     True      False     False       None
 1  concurrent_partitioning      WR_CONCURRENT_PARTITIONING      <class 'bool'>                    False     False     False       None
 2  ctas_approach                WR_CTAS_APPROACH                <class 'bool'>                    False     False     True        False
 3  database                     WR_DATABASE                     <class 'str'>                     True      False     True        default
 4  max_cache_query_inspections  WR_MAX_CACHE_QUERY_INSPECTIONS  <class 'int'>                     False     False     True        500
 5  max_cache_seconds            WR_MAX_CACHE_SECONDS            <class 'int'>                     False     False     True        900
 6  max_remote_cache_entries     WR_MAX_REMOTE_CACHE_ENTRIES     <class 'int'>                     False     False     True        50
 7  max_local_cache_entries      WR_MAX_LOCAL_CACHE_ENTRIES      <class 'int'>                     False     False     True        100
 8  s3_block_size                WR_S3_BLOCK_SIZE                <class 'int'>                     False     True      False       None
 9  workgroup                    WR_WORKGROUP                    <class 'str'>                     False     True      False       None
10  chunksize                    WR_CHUNKSIZE                    <class 'int'>                     False     True      False       None
11  s3_endpoint_url              WR_S3_ENDPOINT_URL              <class 'str'>                     True      True      True        None
12  athena_endpoint_url          WR_ATHENA_ENDPOINT_URL          <class 'str'>                     True      True      True        None
13  sts_endpoint_url             WR_STS_ENDPOINT_URL             <class 'str'>                     True      True      True        None
14  glue_endpoint_url            WR_GLUE_ENDPOINT_URL            <class 'str'>                     True      True      True        None
15  redshift_endpoint_url        WR_REDSHIFT_ENDPOINT_URL        <class 'str'>                     True      True      True        None
16  kms_endpoint_url             WR_KMS_ENDPOINT_URL             <class 'str'>                     True      True      True        None
17  emr_endpoint_url             WR_EMR_ENDPOINT_URL             <class 'str'>                     True      True      True        None
18  lakeformation_endpoint_url   WR_LAKEFORMATION_ENDPOINT_URL   <class 'str'>                     True      True      True        None
19  dynamodb_endpoint_url        WR_DYNAMODB_ENDPOINT_URL        <class 'str'>                     True      True      True        None
20  secretsmanager_endpoint_url  WR_SECRETSMANAGER_ENDPOINT_URL  <class 'str'>                     True      True      True        None
21  timestream_endpoint_url      WR_TIMESTREAM_ENDPOINT_URL      <class 'str'>                     True      True      True        None
22  botocore_config              WR_BOTOCORE_CONFIG              <class 'botocore.config.Config'>  True      False     True        <botocore.config.Config object at 0x14f313e50>
23  verify                       WR_VERIFY                       <class 'str'>                     True      False     True        None
24  address                      WR_ADDRESS                      <class 'str'>                     True      False     False       None
25  redis_password               WR_REDIS_PASSWORD               <class 'str'>                     True      False     False       None
26  ignore_reinit_error          WR_IGNORE_REINIT_ERROR          <class 'bool'>                    True      False     False       None
27  include_dashboard            WR_INCLUDE_DASHBOARD            <class 'bool'>                    True      False     False       None
28  log_to_driver                WR_LOG_TO_DRIVER                <class 'bool'>                    True      False     False       None
29  object_store_memory          WR_OBJECT_STORE_MEMORY          <class 'int'>                     True      False     False       None
30  cpu_count                    WR_CPU_COUNT                    <class 'int'>                     True      False     False       None
31  gpu_count                    WR_GPU_COUNT                    <class 'int'>                     True      False     False       None