This is a quick demo showing how to build an LLM-powered PDF Q&A application using LangChain and Meta Llama 2. It uses all-mpnet-base-v2 for embeddings and Meta Llama-2-7b-chat for question answering.
Demo video: demo.webm
- PDF ingestion and chunking
  - Uses PyMuPDF to extract text blocks from the PDF file (see the chunking sketch below).
  - Each chunk consists of one or more PDF blocks; the default minimum chunk length is 1000 characters.
  - Chunk metadata contains the starting page number and the bounding boxes of the contained blocks.
    - A bounding box is used to highlight its block (see the highlighting sketch below).
- Uses Chroma as the embedding database.
- Similarity search results (chunks) are passed to the LLM as context for answering a user question (see the retrieval sketch below).
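The following is a minimal sketch of what the block-based chunking could look like with PyMuPDF. The function name `chunk_pdf` and the exact metadata layout are illustrative assumptions, not the app's actual code.

```python
import fitz  # PyMuPDF

def chunk_pdf(path, min_chunk_len=1000):
    """Group PDF text blocks into chunks of at least `min_chunk_len` characters."""
    doc = fitz.open(path)
    chunks, text, boxes, start_page = [], "", [], 0
    for page in doc:
        # get_text("blocks") yields tuples (x0, y0, x1, y1, text, block_no, block_type).
        for x0, y0, x1, y1, block_text, _, block_type in page.get_text("blocks"):
            if block_type != 0:          # skip non-text (image) blocks
                continue
            if not text:                 # first block of a new chunk
                start_page = page.number
            text += block_text
            boxes.append((page.number, (x0, y0, x1, y1)))
            if len(text) >= min_chunk_len:
                chunks.append({"text": text,
                               "metadata": {"page": start_page, "bboxes": boxes}})
                text, boxes = "", []
    if text:                             # flush the trailing partial chunk
        chunks.append({"text": text,
                       "metadata": {"page": start_page, "bboxes": boxes}})
    return chunks
```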
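Highlighting a matched block from its bounding box could be done with PyMuPDF annotations, roughly as below; the file name, page number, and coordinates are placeholder values, and the actual app's rendering step may differ.

```python
import fitz  # PyMuPDF

doc = fitz.open("example.pdf")                    # placeholder file name
page_no, bbox = 3, (72.0, 144.0, 520.0, 200.0)    # page/box taken from chunk metadata

page = doc[page_no]
page.add_highlight_annot(fitz.Rect(bbox))         # draw a highlight over the block
page.get_pixmap(dpi=150).save("highlighted.png")  # render the page for display
```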
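The embedding, retrieval, and answering steps might be wired together with LangChain roughly as follows. This is a sketch under assumptions: model loading is simplified (no quantization or device placement, and the Llama 2 weights require accepting Meta's license on Hugging Face), `k=4` and `max_new_tokens=512` are arbitrary choices, and the actual app's chain and prompt may differ.

```python
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.llms import HuggingFacePipeline
from langchain.chains import RetrievalQA
from transformers import pipeline

# Embed the chunks (from the ingestion sketch above) and store them in Chroma.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")
texts = [c["text"] for c in chunks]
metadatas = [{"page": c["metadata"]["page"]} for c in chunks]  # Chroma metadata values must be scalars
db = Chroma.from_texts(texts, embeddings, metadatas=metadatas)

# Wrap Llama-2-7b-chat as a LangChain LLM.
llm = HuggingFacePipeline(pipeline=pipeline(
    "text-generation", model="meta-llama/Llama-2-7b-chat-hf", max_new_tokens=512))

# The top-k similar chunks become the context for answering the user's question.
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever(search_kwargs={"k": 4}))
print(qa.run("What is this document about?"))
```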
To build a Docker image:
docker build --tag pdfchat .
To start a container:
# with GPU
docker run --init -p 8501:8501 --gpus all pdfchat
# no GPU
docker run --init -p 8501:8501 pdfchat
Then view the application in a browser: http://localhost:8501