This is a fork of chat-langchain. It takes data from https://aisafety.info and uses it to create a chatbot that can answer questions about AI safety.
- Install dependencies: `pip install -r requirements.txt`
- Download this folder and extract it into a folder called `aisafety.info`
- Run `ingest.sh` to ingest the data in `aisafety.info`
- Run the app: `make start`
  - To enable tracing, make sure `langchain-server` is running locally and pass `tracing=True` to `get_chain` in `main.py`. You can find more documentation here.
- Open localhost:9000 in your browser.
There are two components: ingestion and question-answering.
Ingestion has the following steps:
- Pull documents from Google Drive
- Load documents with LangChain's document loaders
- Split documents with LangChain's TextSplitter
- Create a vectorstore of embeddings, using LangChain's vectorstore wrapper (with OpenAI's embeddings and FAISS vectorstore).
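
The split, embed, and index steps above can be sketched in plain Python. This is a toy illustration, not the code in `ingest.sh`: the word-count `embed` function stands in for OpenAI's embeddings, `ToyVectorStore` stands in for the FAISS vectorstore, and every name here (`split_text`, `embed`, `ToyVectorStore`, `ingest`) is hypothetical. It only shows how the data flows from raw documents to a searchable index.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (stand-in for a TextSplitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts. Assumption: NOT OpenAI embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory list of (chunk, vector) pairs. Assumption: stand-in for FAISS."""

    def __init__(self):
        self.entries = []

    def add_texts(self, texts):
        for text in texts:
            self.entries.append((text, embed(text)))

    def similarity_search(self, query, k=2):
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True
        )
        return [text for text, _ in ranked[:k]]


def ingest(documents, chunk_size=200, overlap=50):
    """Split every document into chunks and index the chunks."""
    store = ToyVectorStore()
    for doc in documents:
        store.add_texts(split_text(doc, chunk_size, overlap))
    return store
```

Calling `ingest(list_of_document_strings)` returns a store whose `similarity_search(query, k)` gives the `k` chunks closest to the query. The overlap between chunks is a common choice so that a sentence cut at a chunk boundary still appears whole in one of the neighbouring chunks.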
Question-Answering has the following steps, all handled by ChatVectorDBChain:
- Given the chat history and new user input, determine what a standalone question would be (using GPT-3).
- Given that standalone question, look up relevant documents from the vectorstore.
- Pass the standalone question and relevant documents to GPT-3 to generate a final answer.
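
The three question-answering steps above can be sketched as one function that takes the two model calls and the vectorstore lookup as plain callables. This is an illustration of the control flow only, not the implementation of ChatVectorDBChain: `answer_question` and the `fake_*` helpers are hypothetical names, and the fakes replace the GPT-3 calls and the vectorstore so the example runs without an API key. Skipping the rephrasing step when there is no chat history is a choice made for this sketch.

```python
def answer_question(chat_history, user_input, condense, retrieve, generate, k=4):
    """Run the three steps: condense, retrieve, generate.

    chat_history: list of (user message, assistant reply) pairs.
    condense(history, question) -> standalone question (an LLM call in practice).
    retrieve(question, k)       -> list of relevant document strings.
    generate(question, docs)    -> final answer string (an LLM call in practice).
    """
    # Step 1: rewrite the follow-up as a standalone question.
    # With no history there is nothing to resolve, so use the input as-is.
    if chat_history:
        standalone = condense(chat_history, user_input)
    else:
        standalone = user_input

    # Step 2: look up documents relevant to the standalone question.
    docs = retrieve(standalone, k)

    # Step 3: answer from the standalone question plus the retrieved documents.
    answer = generate(standalone, docs)
    return standalone, docs, answer


# Hypothetical stand-ins so the sketch runs offline.
def fake_condense(history, question):
    last_user_message = history[-1][0]
    return f"{question} (in the context of: {last_user_message})"


def fake_retrieve(question, k):
    corpus = [
        "AI alignment is about making AI systems pursue intended goals.",
        "Specifying goals precisely is hard because values are complex.",
    ]
    words = set(question.lower().split())
    ranked = sorted(
        corpus, key=lambda d: len(words & set(d.lower().split())), reverse=True
    )
    return ranked[:k]


def fake_generate(question, docs):
    return f"Q: {question} | sources used: {len(docs)}"
```

Keeping the model calls and the retriever as parameters makes each step easy to swap or test in isolation, for example by passing the fakes above in place of real LLM calls.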