awswrangler.opensearch.index_json
- awswrangler.opensearch.index_json(client: opensearchpy.OpenSearch, path: str, index: str, doc_type: str | None = None, boto3_session: Session | None = Session(region_name=None), json_path: str | None = None, use_threads: bool | int = False, **kwargs: Any) -> Any
Index all documents from JSON file to OpenSearch index.
The JSON file should be in JSON Lines text format (newline-delimited JSON, see https://jsonlines.org/). If the file is a single large JSON document instead, provide json_path.
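As a minimal sketch of the JSON Lines layout described above, the following standard-library snippet writes one JSON object per line and reads it back. The sample documents, field names, and the helpers write_json_lines and read_json_lines are hypothetical and are not part of awswrangler.

```python
import json
import os
import tempfile

# Hypothetical sample documents; the field names are illustrative only.
docs = [
    {"id": 1, "title": "first"},
    {"id": 2, "title": "second"},
]


def write_json_lines(documents, path):
    """Write one JSON object per line (JSON Lines / newline-delimited JSON)."""
    with open(path, "w", encoding="utf-8") as handle:
        for document in documents:
            handle.write(json.dumps(document) + "\n")


def read_json_lines(path):
    """Parse each non-empty line of the file as a separate JSON document."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Write the sample file to a temporary directory.
jsonl_path = os.path.join(tempfile.mkdtemp(), "docs.json")
write_json_lines(docs, jsonl_path)
```

A file produced this way has the shape that the path argument expects when json_path is not given.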
- Parameters:
  - client (OpenSearch) – Instance of opensearchpy.OpenSearch to use.
  - path (str) – S3 or local path to the JSON file that contains the documents.
  - index (str) – Name of the index.
  - doc_type (str | None) – Name of the document type (for Elasticsearch versions 5.x and earlier).
  - json_path (str | None) – JsonPath expression that specifies the explicit path to a single named element in a hierarchical JSON data structure. Read more about JsonPath.
  - boto3_session (Session | None) – Boto3 Session used to access S3 if path is an S3 path. The default boto3 session is used if boto3_session is None.
  - use_threads (bool | int) – True to enable concurrent requests, False to disable multiple threads. If enabled, os.cpu_count() is used as the maximum number of threads. If an integer is provided, that number of threads is used.
  - **kwargs (Any) – Keyword arguments forwarded to index_documents(), which is used to execute the operation.
- Return type:
Any
- Returns:
Response payload https://opensearch.org/docs/opensearch/rest-api/document-apis/bulk/#response.
Examples
Writing contents of JSON file
>>> import awswrangler as wr
>>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
>>> wr.opensearch.index_json(
...     client=client,
...     path='docs.json',
...     index='sample-index1'
... )
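For the single large JSON case, a sketch along the following lines may help. It assumes a hypothetical file whose documents sit under a top-level "records" key, and it assumes that the JsonPath expression $.records selects that element. The helper index_single_json, the file name, and the payload are illustrative only; just the index_json call and its json_path parameter come from the signature above.

```python
import json
import os
import tempfile

# Hypothetical payload: a single JSON document whose "records" key
# holds the documents to index.
payload = {"source": "export", "records": [{"id": 1}, {"id": 2}]}

single_json_path = os.path.join(tempfile.mkdtemp(), "export.json")
with open(single_json_path, "w", encoding="utf-8") as handle:
    json.dump(payload, handle)

# Assumed JsonPath expression pointing at the "records" element.
RECORDS_JSON_PATH = "$.records"


def index_single_json(client, path, index):
    """Hypothetical helper: index the documents nested under "records"."""
    # Imported lazily so the file-preparation part runs without awswrangler.
    import awswrangler as wr

    return wr.opensearch.index_json(
        client=client,
        path=path,
        index=index,
        json_path=RECORDS_JSON_PATH,
    )
```

With a client obtained as in the example above, the call would be index_single_json(client, single_json_path, 'sample-index1'). Whether a given expression matches the intended element depends on the structure of the file, so it is worth checking the expression against a small sample first.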