Granite Retrieval Agent

The Granite Retrieval Agent is an implementation of Agentic RAG (Retrieval Augmented Generation) that answers queries using a combination of local document retrieval and web retrieval. It serves both as a personal productivity tool and as an example of building an agent using task planning, adaptive step-by-step execution, and tool calling, with an open source LLM such as Granite 3.1 at the helm.

For ease of use, this agent is designed to run locally on your laptop, given sufficient processing power and memory, but it can be run anywhere. (Initial tests were done on a MacBook Pro with an M3 Max chip and 64GB of RAM.)

The core agent code is wrapped inside an Open WebUI Function so that you can interact with the agent through an easy-to-use chat UI.

Components

Open WebUI (version 0.5 is supported; use the openwebui_0.4 branch for version 0.4 support)
Ollama
SearXNG
The Python script in this repo, which implements an agentic workflow using the AutoGen framework

Note: The script imports the PyPI package autogen, which is sourced from the AG2 project, a community fork of Microsoft AutoGen. Our agent uses functionality that is compatible with either AG2 or the ~0.2 release of AutoGen.
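As a rough illustration of how an AutoGen 0.2-style agent can be pointed at a local model, the sketch below builds an llm_config dictionary that targets Ollama's OpenAI-compatible endpoint. The helper name ollama_llm_config and the default values are assumptions made for this example; they are not taken from granite_autogen_rag.py.

```python
def ollama_llm_config(model="granite3.1-dense:8b",
                      base_url="http://localhost:11434/v1"):
    """Build an AutoGen 0.2-style llm_config for a local Ollama server.

    Hypothetical helper: the name and defaults are assumptions for illustration.
    """
    return {
        "config_list": [
            {
                "model": model,
                # Ollama exposes an OpenAI-compatible API under /v1.
                "base_url": base_url,
                # Ollama ignores the key, but the client expects a non-empty value.
                "api_key": "ollama",
            }
        ],
        # Disable AutoGen's response cache so every run queries the model.
        "cache_seed": None,
    }
```

With the autogen package installed, the result could be passed to an agent, for example autogen.ConversableAgent(name="assistant", llm_config=ollama_llm_config()).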

High Level Architecture

Agent Architecture

In-Depth Tour of the Agentic Workflow Architecture

The following article describes the advantages of this multi-agent approach, as well as the architecture of the various agents and their interactions:

Build an agentic RAG system with Granite 3.1 on your laptop
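To give a feel for task planning and adaptive step-by-step execution before reading the article, here is a toy sketch of such a loop. It is not the agent's actual code: the tools are stubs, and next_step stands in for a decision that a real agent would delegate to the LLM. All names are hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins for real retrieval tools; each returns a text snippet.
def search_documents(query: str) -> str:
    return f"[local documents] notes matching '{query}'"

def search_web(query: str) -> str:
    return f"[web] results for '{query}'"

TOOLS: Dict[str, Callable[[str], str]] = {
    "documents": search_documents,
    "web": search_web,
}

def next_step(task: str, findings: List[str]) -> Optional[dict]:
    """Stub for the LLM decision: choose the next tool call, or None when done."""
    if not findings:
        return {"tool": "documents", "query": task}
    if len(findings) == 1:
        # Adapt the second step using what the first step found.
        return {"tool": "web", "query": f"{task} (context: {findings[0]})"}
    return None

def run(task: str, max_steps: int = 5) -> List[str]:
    """Execute one step at a time, feeding earlier findings into later decisions."""
    findings: List[str] = []
    for _ in range(max_steps):
        step = next_step(task, findings)
        if step is None:
            break
        findings.append(TOOLS[step["tool"]](step["query"]))
    return findings
```

The max_steps cap is a simple guard against a planner that never signals completion.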

Getting Started

1. Install Ollama

See Ollama's README for full installation instructions. In most cases, it is as simple as:

On macOS:

brew install ollama

On Linux:

curl -fsSL https://ollama.com/install.sh | sh

To run (ollama serve stays in the foreground, so run the pull command in a second terminal):

ollama serve
ollama pull granite3.1-dense:8b

Now you are up and running with Ollama and Granite.
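If you want to confirm from Python that the model was pulled, one option is to ask Ollama for its list of local models through the /api/tags endpoint. This is a minimal sketch assuming Ollama is listening on its default port 11434; the function names are made up for this example.

```python
import json
import urllib.request

def installed_models(tags: dict) -> list:
    """Extract model names from the JSON returned by Ollama's /api/tags."""
    return [m["name"] for m in tags.get("models", [])]

def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask a running Ollama server which models are available locally."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the server running, "granite3.1-dense:8b" in installed_models(fetch_tags()) should evaluate to True once the pull has finished.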

2. Install Open WebUI

pip install open-webui
open-webui serve

3. Set up SearXNG for web search

SearXNG is a metasearch engine that aggregates results from multiple search engines. It is included in this architecture because it requires no SaaS API key and can run directly on your laptop.

Run the SearXNG docker image:

docker run -d --name searxng -p 8888:8080 -v ./searxng:/etc/searxng --restart always searxng/searxng:latest

Note: SearXNG and Open WebUI both use port 8080 by default, so the command above maps SearXNG to port 8888 on your local machine. The ./searxng folder in the docker run command refers to a location on your local machine where you need to provide configuration files for SearXNG. We recommend using the exact configuration files provided in the Open WebUI documentation.
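Because two services share similar default ports, it can help to check which local ports are actually accepting connections. The helper below is a small sketch using only the standard library; port_open is a hypothetical name, and the ports you test should match your own setup.

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0
```

With the setup described here, port_open("localhost", 8080) would correspond to Open WebUI and port_open("localhost", 8888) to SearXNG.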

This agent uses the SearXNG API directly, so you do not need to follow the steps in the Open WebUI documentation to set up SearXNG in the Open WebUI interface. Those steps are only necessary if you want to use SearXNG through the Open WebUI interface apart from this agent.
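To see what calling the SearXNG API directly looks like, the sketch below queries the /search endpoint with format=json and keeps the title and URL of the first few results. It assumes SearXNG is mapped to port 8888 as above and that the JSON output format is enabled in the SearXNG settings; the function names are illustrative and are not taken from the agent script.

```python
import json
import urllib.parse
import urllib.request

def build_search_url(query: str, base_url: str = "http://localhost:8888") -> str:
    """Build a SearXNG search URL that requests JSON output."""
    params = urllib.parse.urlencode({"q": query, "format": "json"})
    return f"{base_url}/search?{params}"

def top_results(payload: dict, limit: int = 3) -> list:
    """Keep the title and URL of the first few entries in a SearXNG response."""
    return [
        {"title": r.get("title", ""), "url": r.get("url", "")}
        for r in payload.get("results", [])[:limit]
    ]

def search(query: str, base_url: str = "http://localhost:8888") -> list:
    """Run a query against a local SearXNG instance (requires it to be running)."""
    with urllib.request.urlopen(build_search_url(query, base_url), timeout=10) as resp:
        return top_results(json.load(resp))
```

If the request is rejected, a likely cause is that the JSON format is not enabled in the configuration files placed in the ./searxng folder.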

4. Import the agent Python script into Open WebUI

In your browser, go to http://localhost:8080/ to access Open WebUI.
If it is your first time opening the Open WebUI interface, register a user name and password. (This information is kept entirely local to your machine, and Open WebUI does not send you emails.)
After logging in, click on the icon on the lower left-hand side where your user name is. This will bring up a pop-up menu.
Click on Admin panel.
At the top of the menu, click on Functions.
At the top right, click the + sign to add a new function.
Give the function a name and a description, such as "Granite RAG Agent".
Paste the contents of granite_autogen_rag.py into the text box provided, replacing any existing content.
Click Save at the bottom of the screen.
Back on the Functions page, make sure the agent is toggled to "Enabled", as shown in the image below.
Click on the gear icon next to the enablement toggle to customize any settings, such as the inference endpoint, the SearXNG endpoint, or the model ID.
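The settings behind the gear icon can be thought of as a small configuration object. Open WebUI functions typically declare such settings in a pydantic Valves class; the sketch below uses a standard-library dataclass instead to stay dependency-free. The field names and default values are assumptions for illustration and may not match those in granite_autogen_rag.py.

```python
from dataclasses import dataclass, asdict

@dataclass
class AgentSettings:
    # Hypothetical field names and defaults, chosen to mirror the setup steps above.
    model_id: str = "granite3.1-dense:8b"
    inference_endpoint: str = "http://localhost:11434"
    searxng_endpoint: str = "http://localhost:8888/search"

def describe(settings: AgentSettings) -> str:
    """Render the settings as 'name = value' lines for a quick sanity check."""
    return "\n".join(f"{key} = {value}" for key, value in asdict(settings).items())
```

For example, describe(AgentSettings(searxng_endpoint="http://localhost:9999/search")) shows the overridden SearXNG endpoint alongside the other defaults.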

5. Load your documents into Open WebUI

In Open WebUI, click on Workspace in the upper left-hand corner.
Click Knowledge at the top of the screen.
Click the + sign to add a new collection.
From here, you may add one or more collections and upload any text documents you like. These documents will be queried when you give the model a task that involves extracting knowledge from your documents.

Usage

Some example queries:

What companies are prominent adopters of the open source technologies my teams are working on?

Study my meeting notes to figure out the capabilities of the projects I’m involved in. Then, find me other open source projects that have similar feature sets.

Important Note: As of 12/25/24, Open WebUI updated its internal architecture with the 0.5 release. The release brings great performance improvements; however, since then we have been experiencing some issues where chat results don't show up in the browser until the browser window is refreshed.
