



Enterprise Search is part of the Generative AI App Builder tool suite provided by Google Cloud.

Gen AI App Builder enables developers with limited machine learning skills to quickly and easily harness the power of Google’s underlying models, search expertise, and conversational AI technology to create enterprise-grade generative AI applications.

Enterprise Search enables organizations to rapidly build AI-powered generative search engines for their customers and employees. Enterprise Search is powered by a variety of Google search technologies, including Semantic Search, which uses natural language processing and machine learning techniques to infer relationships within the content and intent from the user’s query input, providing more relevant results than traditional keyword-based search techniques. Enterprise Search also benefits from Google’s expertise in understanding how users search, and it factors in content relevance when ordering the displayed results.

Google Cloud offers Enterprise Search via Gen App Builder in the Google Cloud Console and via an API for enterprise workflow integration.

This notebook shows you how to configure Enterprise Search and use the Enterprise Search Retriever. The Enterprise Search Retriever encapsulates the Generative AI App Builder Python client library and uses it to access the Enterprise Search Search Service API.

Install prerequisites

To use the Enterprise Search Retriever, you need to install the google-cloud-discoveryengine package.

pip install google-cloud-discoveryengine

Configure access to Google Cloud and Google Cloud Enterprise Search

As of June 6, 2023, Enterprise Search is generally available on an allowlist (that is, customers must be approved for access). For more information on access and pricing, please contact the Google Cloud sales team. Additional features that will later become part of the generally available offering are being previewed through the Trusted Tester program. Register as a Trusted Tester and contact the Google Cloud sales team to request an expedited trial.

Before running this notebook, you should:

Set up or create a Google Cloud project and turn on Gen App Builder
Create and configure an unstructured data store
Set up credentials to access the Enterprise Search API

Set up or create a Google Cloud project and turn on Gen App Builder

Follow the steps in Getting Started with Enterprise Search to set up or create a Google Cloud project and turn on Gen App Builder.

Create an unstructured data store to store data

Create an unstructured data store using the Google Cloud Console and populate it with sample PDF documents from the gs://cloud-samples-data/gen-app-builder/search/alphabet-investor-pdfs Cloud Storage folder. Make sure to use the Cloud Storage (without metadata) option.

Set credentials for accessing the Enterprise Search API

The Gen App Builder client library used by Enterprise Search Retriever provides high-level language support for programmatic authentication to Gen App Builder. The client library supports Application Default Credentials (ADC). The library looks for credentials in a defined set of locations and uses those credentials to authenticate requests to your API. ADC makes credentials available to applications in a variety of environments, such as local development and production environments, without changing application code.
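To get a feel for which credential source ADC is likely to pick up on a given machine, the lookup order can be sketched with the standard library alone. This is a simplified illustration and not the client library's actual logic: the helper name find_adc_source is hypothetical, and it only checks the GOOGLE_APPLICATION_CREDENTIALS environment variable and the file written by `gcloud auth application-default login`, then assumes that a service account attached to the running resource would be used.

```python
import os
from pathlib import Path


def find_adc_source(env=None, home=None):
    """Describe where ADC would likely find credentials (simplified sketch).

    The real lookup is performed by the Google authentication library;
    this hypothetical helper only mirrors the main steps.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # 1. An explicit key file referenced by the environment variable.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        return f"environment variable -> {key_path}"

    # 2. User credentials created by `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(env.get("APPDATA", home)) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    gcloud_file = base / "application_default_credentials.json"
    if gcloud_file.is_file():
        return f"gcloud user credentials -> {gcloud_file}"

    # 3. Otherwise ADC falls back to the service account attached to the
    #    resource running the code, if there is one.
    return "attached service account (if any)"


if __name__ == "__main__":
    print(find_adc_source())
```

Passing env and home explicitly makes the helper easy to try without touching the real environment.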

If you’re running on Google Colab, authenticate with google.colab.auth. Otherwise, follow one of the supported methods to ensure that Application Default Credentials are properly set.

import sys

if "google.colab" in sys.modules:
    from google.colab import auth as google_auth

    google_auth.authenticate_user()

Configure and use the Enterprise Search Retriever

The Enterprise Search Retriever is implemented in the langchain.retrievers.GoogleCloudEnterpriseSearchRetriever class. The get_relevant_documents method returns a list of langchain.schema.Document documents. The page_content field of each document is populated with the extractive segments or extractive answers that match the query. The metadata field is populated with the metadata (if any) of the document from which the segment or answer was extracted.

An extractive answer is verbatim text returned with each search result. It is extracted directly from the original document. Extractive answers are typically displayed near the top of a web page and provide the end user with a short answer that is contextually relevant to their query. Extractive answers are available for website and unstructured search.

An extractive segment is verbatim text returned with each search result. Extractive segments are usually more verbose than extractive answers. Extractive segments can be displayed as answers to queries, used to perform post-processing tasks, or used as input for large language models to generate answers or new text. Extractive segments are available for unstructured search.

For more information on extractive segments and extractive answers, please refer to the product documentation.
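As one illustration of using retrieved text as input for a large language model, the returned documents can be joined into a single context string. This is a minimal sketch under stated assumptions: Doc is a stand-in dataclass that mirrors only the page_content and metadata fields of langchain.schema.Document, the build_context helper and the "source" metadata key are hypothetical, and the sample text is invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for langchain.schema.Document (same two fields)."""

    page_content: str
    metadata: dict = field(default_factory=dict)


def build_context(docs, max_chars=2000):
    """Join document texts into one prompt context, labelled by source."""
    parts = []
    used = 0
    for i, doc in enumerate(docs, start=1):
        source = doc.metadata.get("source", "unknown")  # hypothetical key
        block = f"[{i}] (source: {source})\n{doc.page_content.strip()}"
        if used + len(block) > max_chars:
            break  # stop before exceeding the character budget
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


if __name__ == "__main__":
    sample = [
        Doc("Segment one.", {"source": "a.pdf"}),
        Doc("  Segment two. "),
    ]
    print(build_context(sample))
```

The character budget is a crude proxy for a model's context limit; a token-based budget would be a natural refinement.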

When you create an instance of the retriever, you can specify a number of parameters that control which enterprise data store to access and how natural language queries are processed, including the configuration for extractive answers and segments.

The required parameters are:

project_id – Your Google Cloud PROJECT_ID.
search_engine_id – ID of the data store to use.

The project_id and search_engine_id parameters can be specified explicitly in the retriever’s constructor or through the environment variables PROJECT_ID and SEARCH_ENGINE_ID.
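The fallback from explicit arguments to environment variables can be pictured with a small helper. This is only a sketch of the behavior described above, not the retriever's actual implementation; resolve_setting is a hypothetical name, and the sample values are placeholders.

```python
import os


def resolve_setting(explicit, env_name, env=None):
    """Return the explicit value if given, else the environment variable."""
    env = os.environ if env is None else env
    if explicit:
        return explicit
    value = env.get(env_name)
    if not value:
        raise ValueError(f"Pass a value explicitly or set {env_name}.")
    return value


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fake_env = {"PROJECT_ID": "my-project", "SEARCH_ENGINE_ID": "my-store"}
    print(resolve_setting(None, "PROJECT_ID", env=fake_env))
    print(resolve_setting("other-store", "SEARCH_ENGINE_ID", env=fake_env))
```

In this sketch an explicit argument always wins over the environment, which keeps a notebook cell's intent visible even when the variables are also set.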

You can also set a number of optional parameters, such as:

max_documents – Maximum number of documents used to provide extractive segments or extractive answers.
get_extractive_answers – By default, the retriever is configured to return extractive segments. Set this field to True to return extractive answers.
max_extractive_answer_count – Maximum number of extractive answers returned for each search result. At most 5 answers are returned.
max_extractive_segment_count – Maximum number of extractive segments returned for each search result. Currently one segment is returned.
filter – A filter expression that lets you filter the search results based on the metadata associated with the documents in the searched data store.
query_expansion_condition – A specification that determines under which conditions query expansion occurs.
    0 – Unspecified query expansion condition. In this case, the server behavior defaults to disabled.
    1 – Query expansion disabled. Only the exact search query is used, even if SearchResponse.total_size is 0.
    2 – Automatic query expansion built by the Search API.

Configure and use a retriever that extracts segments

from langchain.retrievers import GoogleCloudEnterpriseSearchRetriever

PROJECT_ID = ""  # Set to your Google Cloud project ID
SEARCH_ENGINE_ID = ""  # Set to the ID of your data store

retriever = GoogleCloudEnterpriseSearchRetriever(
    project_id=PROJECT_ID,
    search_engine_id=SEARCH_ENGINE_ID,
    max_documents=3,
)

query = "What are Alphabet's other bets?"

result = retriever.get_relevant_documents(query)
for doc in result:
    print(doc)

Configure and use a retriever to extract answers

retriever = GoogleCloudEnterpriseSearchRetriever(
    project_id=PROJECT_ID,
    search_engine_id=SEARCH_ENGINE_ID,
    max_documents=3,
    max_extractive_answer_count=3,
    get_extractive_answers=True,
)

query = "What are Alphabet's other bets?"

result = retriever.get_relevant_documents(query)
for doc in result:
    print(doc)
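Before constructing a retriever, the optional parameters listed above can be checked against the limits stated in this section. The helper below is a hedged sketch: build_retriever_kwargs is a hypothetical function, the checks only encode what is described here (at most 5 extractive answers, and query expansion values 0, 1, or 2), the lower bound of 1 answer is an assumption, and the service may enforce other constraints.

```python
ALLOWED_OPTIONS = {
    "max_documents",
    "get_extractive_answers",
    "max_extractive_answer_count",
    "max_extractive_segment_count",
    "filter",
    "query_expansion_condition",
}


def build_retriever_kwargs(**options):
    """Validate optional retriever settings and return them as a dict."""
    unknown = set(options) - ALLOWED_OPTIONS
    if unknown:
        raise ValueError(f"Unknown option(s): {sorted(unknown)}")

    # At most 5 extractive answers; the lower bound of 1 is an assumption.
    answers = options.get("max_extractive_answer_count")
    if answers is not None and not 1 <= answers <= 5:
        raise ValueError("max_extractive_answer_count must be between 1 and 5")

    # 0 = unspecified, 1 = disabled, 2 = automatic.
    expansion = options.get("query_expansion_condition")
    if expansion is not None and expansion not in (0, 1, 2):
        raise ValueError("query_expansion_condition must be 0, 1, or 2")

    return dict(options)


if __name__ == "__main__":
    kwargs = build_retriever_kwargs(
        max_documents=3,
        get_extractive_answers=True,
        max_extractive_answer_count=3,
    )
    print(kwargs)
```

The resulting dictionary could then be unpacked into the retriever's constructor alongside project_id and search_engine_id.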

