awswrangler.opensearch.index_df

awswrangler.opensearch.index_df(client: opensearchpy.OpenSearch, df: DataFrame, index: str, doc_type: str | None = None, use_threads: bool | int = False, **kwargs: Any) → Any

Index all documents from a DataFrame into an OpenSearch index.

Parameters:
  • client (OpenSearch) – Instance of opensearchpy.OpenSearch to use.

  • df (DataFrame) – pandas DataFrame whose rows are indexed as documents.

  • index (str) – Name of the index.

  • doc_type (str | None) – Name of the document type (for Elasticsearch versions 5.x and earlier).

  • use_threads (bool | int) – True to enable concurrent requests, False to disable multithreading. If True, os.cpu_count() is used as the maximum number of threads. If an integer is provided, that number of threads is used.

  • **kwargs (Any) – Keyword arguments forwarded to index_documents(), which is used to execute the operation.
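
The use_threads rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: resolve_max_threads is a hypothetical helper, and treating integers below 1 as a single thread is an assumption.

```python
import os
from typing import Union


def resolve_max_threads(use_threads: Union[bool, int]) -> int:
    """Hypothetical helper: map a use_threads value to a worker count."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(use_threads, bool):
        if not use_threads:
            return 1  # False: no concurrent requests
        # True: use os.cpu_count(), which may return None on some platforms.
        return os.cpu_count() or 1
    # Assumption: integers below 1 fall back to a single thread.
    if use_threads < 1:
        return 1
    return use_threads


print(resolve_max_threads(False))  # single-threaded
print(resolve_max_threads(4))      # explicit thread count
```

Passing an integer is useful when the CPU count of the host is a poor guide to how many concurrent requests the cluster can absorb.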

Return type:

Any

Returns:

Response payload, as described at https://opensearch.org/docs/opensearch/rest-api/document-apis/bulk/#response.

Examples

Writing the rows of a DataFrame

>>> import awswrangler as wr
>>> import pandas as pd
>>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
>>> wr.opensearch.index_df(
...     client=client,
...     df=pd.DataFrame([{'_id': '1'}, {'_id': '2'}, {'_id': '3'}]),
...     index='sample-index1',
... )
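
The example above passes an '_id' column. The sketch below shows one way rows like these could be turned into bulk-style actions, using the action shape accepted by the opensearch-py bulk helpers ('_index', '_id', '_source'). It is an assumption about the mapping, not confirmed awswrangler internals: rows_to_actions and the 'name' field are hypothetical, and plain dicts stand in for DataFrame rows so the block runs without a cluster.

```python
from typing import Any, Dict, Iterable, Iterator


def rows_to_actions(rows: Iterable[Dict[str, Any]], index: str) -> Iterator[Dict[str, Any]]:
    """Hypothetical helper: turn row dicts into bulk-helper style actions."""
    for row in rows:
        source = dict(row)  # copy so the caller's row is not mutated
        action: Dict[str, Any] = {"_index": index}
        if "_id" in source:
            # Assumption: an '_id' column becomes the document ID, not a field.
            action["_id"] = source.pop("_id")
        action["_source"] = source
        yield action


# Plain dicts standing in for DataFrame rows; 'name' is an illustrative field.
rows = [{"_id": "1", "name": "a"}, {"_id": "2", "name": "b"}]
actions = list(rows_to_actions(rows, index="sample-index1"))
for action in actions:
    print(action)
```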