awswrangler.s3.read_deltalake

awswrangler.s3.read_deltalake(path: str | None = None, version: int | None = None, partitions: List[Tuple[str, str, Any]] | None = None, columns: List[str] | None = None, without_files: bool = False, boto3_session: Session | None = None, s3_additional_kwargs: Dict[str, str] | None = None, pyarrow_additional_kwargs: Dict[str, Any] | None = None) → DataFrame

Load a Delta Lake table from an S3 path.

This function requires the deltalake package. See the How to load a Delta table guide for loading instructions.

Parameters:
  • path (Optional[str]) – The path of the DeltaTable.

  • version (Optional[int]) – The version of the DeltaTable.

  • partitions (Optional[List[Tuple[str, str, Any]]]) – A list of partition filters; see help(DeltaTable.files_by_partitions) for the filter syntax.

  • columns (Optional[List[str]]) – The columns to project. This can be a list of column names to include (order and duplicates are preserved).

  • without_files (bool) – If True, load the table without tracking files (memory-friendly). Some append-only applications might not need to track files.

  • boto3_session (Optional[boto3.Session]) – Boto3 Session. If None, the default boto3 session is used.

  • s3_additional_kwargs (Optional[Dict[str, str]]) – Forwarded to the Delta Table class for the storage options of the S3 backend.

  • pyarrow_additional_kwargs (Optional[Dict[str, Any]]) – Forwarded to the PyArrow to_pandas method.

Returns:

df – DataFrame with the results.

Return type:

pd.DataFrame

See also

deltalake.DeltaTable

Create a DeltaTable instance with the deltalake library.
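
Examples

A minimal usage sketch. The bucket, prefix, version number, and column names below are placeholders, and the calls assume valid AWS credentials and the deltalake package installed.

>>> import awswrangler as wr
>>> df = wr.s3.read_deltalake(path="s3://bucket/prefix/")

Read a specific table version, projecting a subset of columns:

>>> df = wr.s3.read_deltalake(
...     path="s3://bucket/prefix/",
...     version=1,
...     columns=["id", "value"],
... )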