AWS SDK for pandas 3.7.3
awswrangler.opensearch.search

    awswrangler.opensearch.search(client: opensearchpy.OpenSearch, index: str | None = '_all', search_body: dict[str, Any] | None = None, doc_type: str | None = None, is_scroll: bool | None = False, filter_path: str | Collection[str] | None = None, **kwargs: Any) → pd.DataFrame

    Return results matching query DSL as pandas DataFrame.

    Parameters:
    • client (OpenSearch) – Instance of opensearchpy.OpenSearch to use.

    • index (str, optional) – A comma-separated list of index names to search. Use _all or an empty string to perform the operation on all indices.

    • search_body (Dict[str, Any], optional) – The search definition using the Query DSL.

    • doc_type (str, optional) – Name of the document type (for Elasticsearch versions 5.x and earlier).

    • is_scroll (bool, optional) – Retrieves a large number of results from a single search request using scroll, for example for machine learning jobs. Because scroll search contexts consume a lot of memory, avoid the scroll operation for frequent user queries.

    • filter_path (Union[str, Collection[str]], optional) – Use the filter_path parameter to reduce the size of the OpenSearch Service response (default: ['hits.hits._id', 'hits.hits._source']).

    • **kwargs – Keyword arguments forwarded to opensearchpy.OpenSearch.search, and also to opensearchpy.helpers.scan if is_scroll=True.

    Returns:

    Results as a pandas DataFrame.

    Return type:

    Union[pandas.DataFrame, Iterator[pandas.DataFrame]]

    Examples

    Searching an index using query DSL

    >>> import awswrangler as wr
    >>> client = wr.opensearch.connect(host='DOMAIN-ENDPOINT')
    >>> df = wr.opensearch.search(
    ...         client=client,
    ...         index='movies',
    ...         search_body={
    ...           "query": {
    ...             "match": {
    ...               "title": "wind"
    ...             }
    ...           }
    ...         }
    ...      )
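
The same call can also be made with is_scroll and filter_path set explicitly. The following is a minimal sketch, not part of the library: build_search_kwargs and search_movies are hypothetical helper names introduced here for illustration, and the 'movies' index and 'title' field are carried over from the example above. The real search needs awswrangler installed and a reachable OpenSearch domain, so it is wrapped in a function that is defined but not called.

```python
from typing import Any, Dict


def build_search_kwargs(index: str, field: str, text: str, scroll: bool = False) -> Dict[str, Any]:
    """Assemble keyword arguments for wr.opensearch.search (hypothetical helper)."""
    return {
        "index": index,
        # Query DSL "match" query, as in the documented example.
        "search_body": {"query": {"match": {field: text}}},
        # Scroll is meant for large result sets, not frequent user queries.
        "is_scroll": scroll,
        # Same value as the documented default, spelled out for clarity.
        "filter_path": ["hits.hits._id", "hits.hits._source"],
    }


def search_movies(client: Any, text: str, scroll: bool = False):
    """Run the search; requires awswrangler and a live OpenSearch client."""
    import awswrangler as wr  # imported lazily so the helper above works without it

    kwargs = build_search_kwargs("movies", "title", text, scroll=scroll)
    return wr.opensearch.search(client=client, **kwargs)


# Inspect the arguments without contacting any cluster.
kwargs = build_search_kwargs("movies", "title", "wind", scroll=True)
print(kwargs["search_body"])
```

Keeping argument construction separate from the network call makes the query body easy to inspect or unit test before it is sent to a domain.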
    
