This project is a LangChain-powered chatbot app that lets users ask questions about tax law. It uses OpenAI embeddings in a Retrieval-Augmented Generation (RAG) system to retrieve relevant passages from a collection of tax law documents and answer from them. LangSmith monitors and helps debug every user interaction, and pytest tests check that the RAG system behaves as expected.
- RAG System: Combines retrieval from document embeddings and OpenAI's LLM to provide accurate and context-specific answers.
- Vector Embeddings: Converts PDF files into OpenAI embeddings and stores them in a FAISS index.
- Interactive UI: User-friendly interface built with Streamlit for seamless interaction.
- LangSmith Monitoring: Tracks and analyzes all interactions to improve the chatbot's performance and reliability.
- Test Coverage: Pytest tests ensure the RAG system functions as expected and retrieves accurate results.
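To make the retrieve-then-generate flow concrete, here is a minimal, dependency-free sketch. It is not the project's implementation: the word-count `embed` function stands in for OpenAI embeddings, the sorted scan in `retrieve` stands in for a FAISS search, and the sample chunks and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (stand-in for OpenAI embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (stand-in for a FAISS search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Combine the retrieved context and the question into one prompt for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical document chunks, for illustration only.
CHUNKS = [
    "Value added tax is charged at the standard rate on taxable goods.",
    "Income tax returns must be filed annually by each taxpayer.",
    "Customs duties apply to goods imported from abroad.",
]

QUESTION = "When must income tax returns be filed?"
top = retrieve(QUESTION, CHUNKS, k=1)
prompt = build_prompt(QUESTION, top)
print(prompt)
```

In the real app the prompt would be sent to the OpenAI model; the sketch stops at prompt construction so it runs without an API key.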
Before running the application, ensure you have the following:
- Python 3.8 or later
- Required Python libraries:
  - streamlit
  - faiss-cpu
  - pypdf
  - langchain
  - langchain-openai
  - pytest
  - langsmith
- ./documentations: Place your tax-related PDF documents here.
- app.py: Main application file.
- create_vectordb.py: Creates the vector database from document embeddings and saves it locally.
- test_rag.py: Contains pytest tests for the RAG system.
- Clone the repository:
  git clone https://github.com/BertrandConxy/Tax-Geek-AI-chatbot.git
  cd Tax-Geek-AI-chatbot
- Create a virtual environment:
  python -m venv venv
- Install the dependencies:
  pip install -r requirements.txt
- Ensure your tax-related PDF documents are in the documentations folder.
- Create a .env file for your credentials, with these settings:
  OPENAI_API_KEY
  LANGSMITH_TRACING=True
  LANGSMITH_ENDPOINT
  LANGSMITH_API_KEY
  LANGSMITH_PROJECT
- Run create_vectordb.py to create vector embeddings for the documents and store the database locally:
  python create_vectordb.py
- Run the Streamlit app:
  streamlit run app.py
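A missing credential is an easy mistake to make during setup, so a small pre-flight check can help. The sketch below is an assumption, not part of the repository: the function names are hypothetical, the setting names are the ones listed for the .env file, and the parser handles only simple KEY=VALUE lines. The README does not say how the app loads .env, so the script reads the file directly when it exists.

```python
import os

# Setting names as listed for the .env file.
REQUIRED_SETTINGS = [
    "OPENAI_API_KEY",
    "LANGSMITH_TRACING",
    "LANGSMITH_ENDPOINT",
    "LANGSMITH_API_KEY",
    "LANGSMITH_PROJECT",
]


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes so KEY="" counts as empty.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values):
    """Return the required names that are absent or empty in the mapping."""
    return [name for name in REQUIRED_SETTINGS if not values.get(name)]


if __name__ == "__main__":
    # Start from the process environment, then overlay .env if it exists.
    values = dict(os.environ)
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            values.update(parse_env_text(handle.read()))
    missing = missing_settings(values)
    print("Missing settings:", ", ".join(missing) if missing else "none")
```

Running it from the project root before `python create_vectordb.py` reports any setting that still needs a value.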
LangSmith is integrated into this project to monitor and analyze chatbot interactions. This ensures the app remains robust and user-friendly. To configure LangSmith:
- Set up your LangSmith account and API key.
- Ensure LANGSMITH_API_KEY is added to your environment variables.
Pytest tests are included to validate the functionality of the RAG system. To run the tests:
pytest test_rag.py
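For readers unfamiliar with pytest, the sketch below shows the general shape such a retrieval test can take. It is illustrative only and does not reproduce the project's tests: `fake_retriever` is a hypothetical stand-in that returns canned text, so the example runs without an API key or a vector database.

```python
def fake_retriever(question):
    """Hypothetical stand-in for the project's retriever; returns canned chunks."""
    corpus = {
        "vat": "Value added tax is charged on taxable goods and services.",
        "income": "Income tax returns must be filed annually.",
    }
    words = question.lower().split()
    # Keep only chunks whose keyword appears as a word in the question.
    return [text for key, text in corpus.items() if key in words]


def test_retriever_returns_relevant_chunk():
    results = fake_retriever("How is income tax filed?")
    assert results, "expected at least one retrieved chunk"
    assert "income tax" in results[0].lower()


def test_retriever_returns_nothing_for_unrelated_question():
    assert fake_retriever("What is the weather today?") == []
```

pytest collects functions whose names start with `test_` and reports each failed `assert` with the values involved; a real test would swap the stand-in for the retriever built from the saved vector database.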