AWS SDK for pandas 3.1.1

awswrangler.opensearch.search

awswrangler.opensearch.search(client: opensearchpy.OpenSearch, index: str | None = '_all', search_body: Dict[str, Any] | None = None, doc_type: str | None = None, is_scroll: bool | None = False, filter_path: str | Collection[str] | None = None, **kwargs: Any) → DataFrame

Return results matching query DSL as pandas DataFrame.

Parameters:
  • client (OpenSearch) – instance of opensearchpy.OpenSearch to use.

  • index (str, optional) – A comma-separated list of index names to search. Use _all or an empty string to perform the operation on all indices.

  • search_body (Dict[str, Any], optional) – The search definition using the Query DSL.

  • doc_type (str, optional) – Name of the document type (for Elasticsearch versions 5.x and earlier).

  • is_scroll (bool, optional) – Retrieves a large number of results from a single search request using scroll, for example for machine learning jobs. Because scroll search contexts consume a lot of memory, avoid the scroll operation for frequent user queries.

  • filter_path (Union[str, Collection[str]], optional) – Use the filter_path parameter to reduce the size of the OpenSearch Service response (default: ['hits.hits._id', 'hits.hits._source']).

  • **kwargs – Keyword arguments forwarded to opensearchpy.OpenSearch.search, and also to opensearchpy.helpers.scan if is_scroll=True.

Returns:

Results as a pandas DataFrame.

Return type:

Union[pandas.DataFrame, Iterator[pandas.DataFrame]]
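To picture how a search response relates to the returned rows, the sketch below flattens a response that has been trimmed to the default filter_path fields (hits.hits._id and hits.hits._source). The sample data and the hits_to_rows helper are hypothetical and written for illustration only; this is not the library's actual implementation.

```python
from typing import Any, Dict, List

# Hypothetical response, trimmed the way the default filter_path would trim it:
# only hits.hits._id and hits.hits._source remain.
sample_response: Dict[str, Any] = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"title": "Gone with the Wind", "year": 1939}},
            {"_id": "2", "_source": {"title": "The Wind Rises", "year": 2013}},
        ]
    }
}


def hits_to_rows(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten each hit into one flat dict (hypothetical helper, one dict per row)."""
    hits = response.get("hits", {}).get("hits", [])
    # Put the document id next to the fields stored in _source.
    return [{"_id": hit.get("_id"), **hit.get("_source", {})} for hit in hits]


rows = hits_to_rows(sample_response)
```

Passing rows to pandas.DataFrame(rows) would give a frame with the columns _id, title and year. A response without hits yields an empty list.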

Examples

Searching an index using query DSL

>>> import awswrangler as wr
>>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
>>> df = wr.opensearch.search(
...         client=client,
...         index='movies',
...         search_body={
...           "query": {
...             "match": {
...               "title": "wind"
...             }
...           }
...         }
...      )
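Searching a large index with scroll

A minimal sketch of how the is_scroll and filter_path parameters described above might be combined. The helpers build_search_kwargs and search_large_index are hypothetical names introduced for this example, and the index and field names are assumptions. The actual call needs awswrangler installed and a reachable domain, so the import is deferred into the function that uses it.

```python
from typing import Any, Dict, Optional, Sequence


def build_search_kwargs(
    index: str,
    field: str,
    text: str,
    is_scroll: bool = False,
    filter_path: Optional[Sequence[str]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for wr.opensearch.search (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "index": index,
        "search_body": {"query": {"match": {field: text}}},
        "is_scroll": is_scroll,
    }
    # Only override filter_path when the caller asks for it, so the
    # documented default stays in effect otherwise.
    if filter_path is not None:
        kwargs["filter_path"] = list(filter_path)
    return kwargs


def search_large_index(client: Any, index: str, field: str, text: str) -> Any:
    """Run a scroll-based search; requires awswrangler and a live connection."""
    import awswrangler as wr  # deferred so build_search_kwargs has no dependencies

    return wr.opensearch.search(
        client=client, **build_search_kwargs(index, field, text, is_scroll=True)
    )


# Building the arguments needs no connection.
scroll_kwargs = build_search_kwargs("movies", "title", "wind", is_scroll=True)
```

With a client from wr.opensearch.connect, search_large_index(client, 'movies', 'title', 'wind') would issue the same match query as the example above, but through scroll. As noted under is_scroll, this suits occasional bulk reads rather than frequent user queries.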
