pgai simplifies the process of building search, Retrieval Augmented Generation (RAG), and other AI applications with PostgreSQL.
The goal of pgai is to make working with AI easier and more accessible to developers. Because data is the foundation of most AI applications, pgai makes it easier to leverage your data in AI workflows. In particular, pgai supports:
Working with embeddings generated from your data:
- Automatically create and sync vector embeddings for your data (learn more)
- Search your data using vector and semantic search (learn more)
- Implement Retrieval Augmented Generation inside a single SQL statement (learn more)
Leverage LLMs for data processing tasks:
- Retrieve LLM chat completions from models like Claude Sonnet 3.5, OpenAI GPT4o, Cohere Command, and Llama 3 (via Ollama). (learn more)
- Reason over your data and facilitate use cases like classification, summarization, and data enrichment on your existing relational data in PostgreSQL (see an example).
Learn more about pgai: To learn more about the pgai extension and why we built it, read pgai: Giving PostgreSQL Developers AI Engineering Superpowers.
Contributing: We welcome contributions to pgai! See the Contributing page for more information.
Here's how to get started with pgai:
For a quick start, try out automatic data embedding using pgai Vectorizer:
- Try our cloud offering by creating a free trial account and heading over to our pgai Vectorizer documentation.
- or check out our quick start guide to get up and running in less than 10 minutes with a self-hosted Postgres instance.
For other use cases, first Install pgai in Timescale Cloud, a pre-built Docker image, or from source. Then, choose your own adventure:
- Automate AI embedding with pgai Vectorizer.
- Use pgai to integrate AI from your provider. Some examples:
- Ollama - configure pgai for Ollama, then use the model to embed, chat complete and generate.
- OpenAI - configure pgai for OpenAI, then use the model to tokenize, embed, chat complete and moderate. This page also includes advanced examples.
- Anthropic - configure pgai for Anthropic, then use the model to generate content.
- Cohere - configure pgai for Cohere, then use the model to tokenize, embed, chat complete, classify, and rerank.
- Leverage LLMs for data processing tasks such as classification, summarization, and data enrichment (see the OpenAI example).
The fastest ways to run PostgreSQL with the pgai extension are to:
-
Create your database environment. Either:
Run the TimescaleDB Docker image, then enable the pgai extension.
pgai is available for new or existing Timescale Cloud services. For any service, enable the pgai extension.
To install pgai from source on a PostgreSQL server:
-
Install the prerequisite software system-wide
-
Python3: if running
python3 --version
in Terminal returnscommand not found
, download and install the latest version of Python3. -
Pip: if running
pip --version
in Terminal returnscommand not found
:-
Standard installation: use one of the pip supported methods.
-
Virtual environment: usually, pip is automatically installed if you are working in a Python virtual environment. If you are running PostgreSQL in a virtual environment, pgai requires several python packages. Set the
PYTHONPATH
andVIRTUAL_ENV
environment variables before you start your PostgreSQL server.PYTHONPATH=/path/to/venv/lib/python3.12/site-packages \ VIRTUAL_ENV=/path/to/venv \ pg_ctl -D /path/to/data -l logfile start
-
-
PL/Python: follow How to install Postgres 16 with plpython3u: Recipes for macOS, Ubuntu, Debian, CentOS, Docker.
macOS: the standard PostgreSQL brew in Homebrew does not include the
plpython3
extension. These instructions show how to install from an alternate tap.-
Postgresql plugin for the asdf version manager: set the
--with-python
option when installing PostgreSQL:POSTGRES_EXTRA_CONFIGURE_OPTIONS=--with-python asdf install postgres 16.3
-
-
pgvector: follow the install instructions from the official repository.
These extensions are automatically added to your PostgreSQL database when you Enable the pgai extension.
-
-
Make this
pgai
extension:make install
-
Connect to your database with a postgres client like psql v16 or PopSQL.
psql -d "postgres://<username>:<password>@<host>:<port>/<database-name>"
-
Create the pgai extension:
CREATE EXTENSION IF NOT EXISTS ai CASCADE;
The
CASCADE
automatically installspgvector
andplpython3u
extensions.
The main features in pgai are:
Working with embeddings generated from your data:
- Automatically create and sync vector embeddings for your data
- Search your data using vector and semantic search
- Implement Retrieval Augmented Generation inside a single SQL statement
Leverage LLMs for data processing tasks: You can use pgai to integrate AI from the following providers:
Learn how to moderate content directly in the database using triggers and background jobs.
The pgvector and pgvectorscale extensions allow you to store vector embeddings in your database and perform fast and efficient vector search. The pgai Vectorizer builds on top of these extensions to automatically create and synchronize embeddings for any text data in your database.
With one line of code, you can define a vectorizer that creates embeddings for data in a table:
SELECT ai.create_vectorizer(
<table_name>::regclass,
destination => <embedding_table_name>,
embedding => ai.embedding_openai(<model_name>, <dimensions>),
chunking => ai.chunking_recursive_character_text_splitter(<column_name>)
);
This newly created vectorizer will automatically track any changes to the data in the source table and update the destination embedding table with the new embeddings asynchronously.
Automate AI embedding with pgai Vectorizer shows you how to implement embeddings in your own data. When you create Vectorizers in a Timescale Cloud database, embeddings are automatically created and synchronized in the background. On a self-hosted Postgres installation, you use a Vectorizer Worker to asynchronously processes your vectorizers.
pgai exposes a set of functions to directly interact with the LLM models through SQL, enabling you to do semantic search directly in your database:
SELECT
chunk,
embedding <=> ai.openai_embed(<embedding_model>, 'some-query') as distance
FROM <embedding_table>
ORDER BY distance
LIMIT 5;
This is a perfectly normal SQL query. You can combine it with where
clauses and other SQL features to
further refine your search. pgai solves the missing where clause in vector search problem for real.
Similar to semantic search, pgai LLM functions enable you to implement RAG directly in your database. For example:
-
Create a RAG function:
CREATE OR REPLACE FUNCTION generate_rag_response(query_text TEXT) RETURNS TEXT AS $$ DECLARE context_chunks TEXT; response TEXT; BEGIN -- Perform similarity search to find relevant blog posts SELECT string_agg(title || ': ' || chunk, ' ') INTO context_chunks FROM ( SELECT title, chunk FROM blogs_embedding ORDER BY embedding <=> ai.openai_embed('text-embedding-3-small', query_text) LIMIT 3 ) AS relevant_posts; -- Generate a summary using gpt-4o-mini SELECT ai.openai_chat_complete( 'gpt-4o-mini', jsonb_build_array( jsonb_build_object('role', 'system', 'content', 'You are a helpful assistant. Use only the context provided to answer the question. Also mention the titles of the blog posts you use to answer the question.'), jsonb_build_object('role', 'user', 'content', format('Context: %s\n\nUser Question: %s\n\nAssistant:', context_chunks, query_text)) ) )->'choices'->0->'message'->>'content' INTO response; RETURN response; END; $$ LANGUAGE plpgsql;
-
Execute your function in a SQL query:
SELECT generate_rag_response('Give me some startup advice');
pgai is still at an early stage. Now is a great time to help shape the direction of this project; we are currently deciding priorities. Have a look at the list of features we're thinking of working on. Feel free to comment, expand the list, or hop on the Discussions forum.
To get started, take a look at how to contribute and how to set up a dev/test environment.
Timescale is a PostgreSQL database company. To learn more visit the timescale.com.
Timescale Cloud is a high-performance, developer focused, cloud platform that provides PostgreSQL services for the most demanding AI, time-series, analytics, and event workloads. Timescale Cloud is ideal for production applications and provides high availability, streaming backups, upgrades over time, roles and permissions, and great security.