Semantic Search with Vertex AI

Author: Firebase (https://firebase.google.com)

Description: Search for semantically similar text documents in Firestore with PaLM API and Vertex AI Matching Engine.

Details: ⚠️ The Vertex Matching Engine Public Endpoints feature and the PaLM API are currently in public preview.

For details and limitations, see the Vertex AI documentation and PaLM API documentation.

PaLM API is an optional feature of this extension. If you choose to use the PaLM model, please ensure that you have already signed up for the waitlist and have been approved.

This extension adds text similarity search to your Firestore application using Vertex AI’s Matching Engine. Text similarity search relies on first generating embeddings (vector representations of your original text), which are stored in a Matching Engine index. Once these embeddings are indexed, the Matching Engine can find the documents in a large corpus of candidates that are most semantically similar to a given document, based on vector distance measures.
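To make "vector distance measures" concrete, here is a minimal sketch of ranking documents by cosine similarity between embeddings. It is a brute-force illustration only: Matching Engine uses approximate nearest-neighbor search at scale, and the names `cosineSimilarity`, `IndexedDoc`, and `nearestNeighbors` are hypothetical and not part of the extension.

```typescript
// Cosine similarity between two embedding vectors of equal dimension.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape: a document ID paired with its stored embedding.
interface IndexedDoc {
  id: string;
  embedding: number[];
}

// Brute-force ranking: score every candidate, return the IDs of the top k.
function nearestNeighbors(query: number[], docs: IndexedDoc[], k: number): string[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((d) => d.id);
}
```

Real embeddings have hundreds of dimensions, but the ranking idea is the same: documents whose vectors point in a similar direction to the query vector are returned first.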

On installation, you will need to specify a Firestore collection path to index and the document fields to index.

Once installed, the extension does the following:

  1. Automatically generates and stores embeddings in Vertex AI whenever documents are created or updated in the target collection(s), and removes them when documents are deleted
  2. Provides a secure API endpoint to query similar documents (given an input document) that can be used by client applications
  3. (Optional) Backfills existing data from target collection(s)

The query API endpoint is deployed as a Firebase Callable Function, and requires that you are signed in with a Firebase Auth user to successfully call the Function from your client application.
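As a rough sketch of how a client might wrap the query endpoint, the helper below takes the callable as a parameter so it stays self-contained. In a real app that callable would typically come from `httpsCallable` in the Firebase JS SDK, called while a Firebase Auth user is signed in. The request and response shapes (`QueryRequest`, `Neighbor`) and the helper `findSimilar` are assumptions for illustration; check your installed extension instance for the actual function name and payload format.

```typescript
// Hypothetical request shape; the real field names may differ.
interface QueryRequest {
  query: string[];
  neighbours: number;
}

// Hypothetical response item: a document ID and its distance to the query.
interface Neighbor {
  id: string;
  distance: number;
}

// Matches the general shape of a Firebase callable: data in, { data } out.
type Callable<Req, Res> = (data: Req) => Promise<{ data: Res }>;

// Validates the input, calls the endpoint, and returns matching document IDs.
async function findSimilar(
  callQueryIndex: Callable<QueryRequest, Neighbor[]>,
  text: string,
  limit: number = 5
): Promise<string[]> {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    throw new Error("Query text must not be empty");
  }
  const result = await callQueryIndex({ query: [trimmed], neighbours: limit });
  return result.data.map((n) => n.id);
}
```

Injecting the callable also makes the helper easy to test with a fake, without deploying anything.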

Embeddings models

The extension currently provides three options for generating text embeddings: Universal Sentence Encoder (USE) from TensorFlow Hub, the PaLM Text Embeddings API (models/embedding-gecko-001), or any GraphDef-based TF JS model in a GCS bucket.

There are several important differences, so make sure you pick an option which suits your use-case:

  • Speed: currently the PaLM endpoint does not allow batch processing, so the backfill process will take longer. Choose the USE model if you would like the extension to run on a pre-existing collection with many documents (>10K) already.
  • Dimensions: the PaLM model embeds to a space of dimension 768, whereas the USE model embeds to a space of dimension 512. Higher-dimensional indexes cost more on Vertex AI but can capture more features.
  • Memory: models from TensorFlow Hub will be loaded into Function memory, whereas PaLM provides an API. Large models may require you to increase the memory from the default (512MB), which can incur additional Functions costs.
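To get a feel for the dimension trade-off above, the following sketch compares the raw size of the embeddings for the two models. It assumes each vector component is stored as a 4-byte float32, which is an assumption for illustration only; actual index size and Vertex AI cost depend on how Matching Engine stores and shards the data. The helper `rawEmbeddingBytes` is hypothetical.

```typescript
// Assumption: each embedding component is a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Raw size of the embedding vectors alone, ignoring IDs and index overhead.
function rawEmbeddingBytes(docCount: number, dimensions: number): number {
  return docCount * dimensions * BYTES_PER_FLOAT32;
}

const exampleDocCount = 10000;
const useBytes = rawEmbeddingBytes(exampleDocCount, 512);  // USE: 20,480,000 bytes
const palmBytes = rawEmbeddingBytes(exampleDocCount, 768); // PaLM: 30,720,000 bytes
const ratio = palmBytes / useBytes;                        // 1.5
```

Under this assumption, PaLM embeddings take 1.5 times the raw space of USE embeddings for the same number of documents.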

Additional Setup

Cloud Audit Log access

Before installing the extension, you need to enable Data Read and Data Write access in Cloud Audit Logs for the Vertex AI API. To enable it:

  • Visit this page and ensure that you have selected the project you’d like to install this extension in, using the project picker.
  • Filter for “Vertex AI API” and click on the checkbox next to it. A new panel should appear on the right side of the page.
  • On the new panel, click on the checkboxes next to “Data Read” and “Data Write”, and click Save.

PaLM API access (optional)

If you would like to use the PaLM embeddings model, you will first need to apply for access to the PaLM API via this waitlist.

Once you have access, please enable the Generative Language API in your Google Cloud Project before installing this extension.

Cloud Firestore and Cloud Storage setup

Make sure that you've set up a Cloud Firestore database and enabled Cloud Storage in your Firebase project.

After installation, you will also need to add security rules to a new Firestore collection that the extension creates to store internal backfill state. Please check the extension instance after installation for more details.

Installation time

Note that the extension takes about 2 hours to finish installing and processing: at least 40 minutes to create the index, 60 minutes to deploy it, and the rest of the time to backfill existing data (optional). The total runtime depends on how large your existing dataset is.

Billing

To install an extension, your project must be on the Blaze (pay as you go) plan. You will be charged a small amount (typically around $0.01/month) for the Firebase resources required by this extension (even if it is not used). This extension uses other Firebase and Google Cloud Platform services, which have associated charges if you exceed the service's no-cost tier:

  • Cloud Firestore
  • Cloud Storage
  • Cloud Run
  • Eventarc
  • Vertex AI
  • Cloud Functions (See FAQs)

Learn more about Firebase billing.

Additionally, this extension can optionally use the PaLM API, which is currently in public preview. During the preview period, developers can try the PaLM API at no cost. Pricing will be announced closer to general availability. For more information on the PaLM API public preview, see the PaLM API documentation.

⚠️ Note: The extension does not delete the Matching Engine Index automatically when you uninstall the extension.

Vertex AI charges by node hour when hosting a Matching Engine Index, so your project will continue to incur costs until you manually undeploy the index. Instructions for undeploying an index are available here.

You can read more about Matching Engine pricing here.
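Because hosting is billed per node hour, the cost of leaving an index deployed grows linearly with time. The sketch below shows the arithmetic. The rate used in the example is a made-up placeholder, not a real Vertex AI price, and `estimateHostingCost` is a hypothetical helper; use the rate from the Matching Engine pricing page for your region and machine type.

```typescript
// Hosting cost = nodes × hours deployed × price per node hour.
function estimateHostingCost(
  nodeCount: number,
  hoursDeployed: number,
  pricePerNodeHour: number
): number {
  if (nodeCount < 0 || hoursDeployed < 0 || pricePerNodeHour < 0) {
    throw new Error("Inputs must be non-negative");
  }
  return nodeCount * hoursDeployed * pricePerNodeHour;
}

// PLACEHOLDER rate for illustration only; not an actual Vertex AI price.
const placeholderRate = 0.5;

// One node left deployed for 30 days (720 hours) at the placeholder rate.
const thirtyDayCost = estimateHostingCost(1, 24 * 30, placeholderRate); // 360
```

The same calculation applies after uninstalling the extension: until the index is undeployed, the hours keep accumulating.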

Configuration Parameters:

Cloud Functions:

  • backfillTrigger: Sets up the Vertex Matching Engine index and creates a private connection to it.

  • updateIndexConfig: Updates the configuration of the Vertex Matching Engine index.

  • backfillTask: A task-triggered function that gets called before a Vertex Matching Engine index is created. It backfills embeddings for all documents in the collection.

  • createIndexTrigger: An event-triggered function that gets called when a special metadata document is updated. It checks the status of the backfill every time, and once the backfill is done it triggers index creation.

  • streamUpdateDatapoint: An event-triggered function that gets called when a document is created or updated. It generates embeddings for the document and updates the Matching Engine index.

  • streamRemoveDatapoint: An event-triggered function that gets called when a document is deleted. It deletes the document's datapoint from the Matching Engine index.

  • datapointWriteTask: A task-triggered function that gets scheduled when a new write operation is detected but the index isn't ready. It generates embeddings for the document and updates the Matching Engine index.

  • queryIndex: A function that queries the Vertex Matching Engine index.
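The way document writes are handled by the functions above can be summarized with a small routing sketch. This is a simplified model of the descriptions in the list, not the extension's actual code: `routeWrite` and its types are hypothetical, and the sketch assumes that any write arriving before the index is ready, including a delete, is deferred to `datapointWriteTask`.

```typescript
type WriteKind = "create" | "update" | "delete";

type Handler =
  | "streamUpdateDatapoint"
  | "streamRemoveDatapoint"
  | "datapointWriteTask";

// Decides which function ends up applying a document write to the index.
function routeWrite(kind: WriteKind, indexReady: boolean): Handler {
  // Assumption: while the index is not ready, every write is queued as a task.
  if (!indexReady) {
    return "datapointWriteTask";
  }
  // Once the index is ready, deletes remove the datapoint and
  // creates/updates regenerate the embedding and update the datapoint.
  return kind === "delete" ? "streamRemoveDatapoint" : "streamUpdateDatapoint";
}
```

For example, `routeWrite("update", false)` yields `"datapointWriteTask"`, while the same write after deployment yields `"streamUpdateDatapoint"`.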

Other Resources:

  • onIndexCreated (firebaseextensions.v1beta.v2function)

  • onIndexDeployed (firebaseextensions.v1beta.v2function)

APIs Used:

  • aiplatform.googleapis.com (Reason: Powers Vertex Matching Engine)

  • eventarc.googleapis.com (Reason: Powers all events and triggers)

  • run.googleapis.com (Reason: Powers v2 Cloud Functions)

  • storage-component.googleapis.com (Reason: Needed to use Cloud Storage)

Access Required:

This extension will operate with the following project IAM roles:

  • datastore.user (Reason: This extension requires read/write access to Firestore.)

  • storage.admin (Reason: This extension requires write access to Cloud Storage to create a bucket and upload embeddings files to it as part of the backfill.)

  • aiplatform.user (Reason: This extension requires access to Vertex AI to create, update and query a Vertex Matching Engine index.)