UPSC-LLM

Answering UPSC current-affairs questions using a RAG-based LLM pipeline.

The UPSC Civil Services Examination is one of the world's toughest examinations. Its first stage, the "Prelims", asks many questions related to news articles from the previous year. Millions of students prepare and appear for it.

In this project, we try to answer Prelims-type questions using newspaper and conceptual articles sourced from the internet. For each question, a RAG-based implementation first finds the relevant documents and then passes them as context to an LLM, which answers the question.
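The retrieve-then-answer flow described above can be sketched in a few lines. This is only an illustration, not the project's code: the word-overlap scoring is a toy stand-in for a real vector index, the function names (tokenize, retrieve, build_prompt) are hypothetical, and the documents are placeholder text. The LLM call itself is left out.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the largest word overlap with the query.

    Toy stand-in for a vector index (assumption): a real retriever would
    compare embeddings rather than raw word counts.
    """
    query_tokens = set(tokenize(query))

    def overlap(doc: str) -> int:
        counts = Counter(tokenize(doc))
        return sum(counts[token] for token in query_tokens)

    # sorted() is stable, so documents with equal scores keep their order.
    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(question: str, context_docs: list[str]) -> str:
    """Combine the retrieved documents and the question into one LLM prompt."""
    context = "\n\n".join(
        f"[{i}] {doc}" for i, doc in enumerate(context_docs, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Placeholder documents, not real scraped articles.
    docs = [
        "Placeholder article about Cyclone Remal and the affected coastal areas.",
        "Placeholder article about a repo rate decision.",
        "Placeholder note on how cyclones are named.",
    ]
    question = "Which areas are affected by Cyclone Remal?"
    print(build_prompt(question, retrieve(question, docs, k=2)))
```

The prompt string produced by build_prompt is what would be sent to the LLM as context plus question.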

First, we tried LangChain with a Cohere LLM on a static dataset that we had compiled in a Google Sheet. This was implemented in a Google Colab notebook (RAG_LangChain_UPSC.ipynb).

Then we used Pathway's streaming framework to process data as it is scraped from the internet. This is implemented using their Docker app, with a Docker Compose layer on top to run our data generators.

Run

1. Get an OpenAI API key and set it in demo-question-answering/.env as OPENAI_API_KEY.
2. Install Docker and Docker Compose.
3. Run

docker compose up

and wait for the containers to spin up and start listening on port 8000.

To see statistics:

curl -X 'POST' 'http://localhost:8000/v1/statistics'

To find documents:

curl -X 'POST' \
  'http://localhost:8000/v1/retrieve' \
  -H 'accept: */*' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "Which areas are affected by Cyclone Remal?",
  "k": 4
}'

To ask questions:

curl -X 'POST' \
  'http://localhost:8000/v1/pw_ai_answer' \
  -H 'accept: */*' \
  -H 'Content-Type: application/json' \
  -d '{
  "prompt": "Which areas are affected by Cyclone Remal?"
}'
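The same three calls can also be made from Python. The sketch below uses only the standard library and mirrors the curl commands above; BASE_URL, build_request, call, and demo are hypothetical names, and it assumes that the endpoints return JSON bodies. The exact response shape is not documented here, so the code simply decodes whatever JSON comes back.

```python
import json
import urllib.request
from typing import Any, Optional

# Assumption: the containers listen on port 8000, as in the curl examples.
BASE_URL = "http://localhost:8000"


def build_request(endpoint: str, payload: Optional[dict] = None) -> urllib.request.Request:
    """Build a POST request for one of the /v1/... endpoints.

    With no payload, the request has no body, like the statistics curl call.
    """
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"accept": "*/*"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{BASE_URL}{endpoint}", data=data, headers=headers, method="POST"
    )


def call(endpoint: str, payload: Optional[dict] = None, timeout: float = 60.0) -> Any:
    """Send the request and decode the response, assuming it is JSON."""
    request = build_request(endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def demo() -> None:
    """Run the three example calls; requires the containers to be up."""
    question = "Which areas are affected by Cyclone Remal?"
    print(call("/v1/statistics"))
    print(call("/v1/retrieve", {"query": question, "k": 4}))
    print(call("/v1/pw_ai_answer", {"prompt": question}))
```

Calling demo() after docker compose up has finished starting should print the statistics, the retrieved documents, and the answer in turn.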

Demo

How this works

We use the demo-question-answering example from Pathway's LLM-App examples, configured to read two data streams from folders. It uses OpenAI's API.

Source - Coaching: Free UPSC current-affairs materials uploaded daily on the website of a popular coaching institute (Vajiram and Ravi). They cover concepts relevant to current events, which helps in answering the more conceptual questions.
Source - News: We scrape newspaper articles from The Hindu's ePaper and save them to a folder. The articles are fetched from the ePaper's internal CDN, which gives us high-quality HTML content and complete, factual coverage of each event.

Both these data sources are built as microservices with Docker, and their output directories are mounted as data source folders for the Pathway Docker App.
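As a rough illustration of how a data source can feed such a mounted folder, the sketch below writes each scraped document to its own file in an output directory. It is not the project's scraper code: save_document, the URL-hash filename scheme, and the example URL are all hypothetical, and the scraping step itself is omitted.

```python
import hashlib
import tempfile
from pathlib import Path


def save_document(output_dir: Path, source_url: str, content: str,
                  suffix: str = ".html") -> Path:
    """Write one scraped document into output_dir and return its path.

    The filename is derived from a hash of the source URL (an assumption made
    for this sketch), so re-scraping the same URL overwrites the earlier file
    instead of creating a duplicate.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(source_url.encode("utf-8")).hexdigest()[:16]
    target = output_dir / f"{digest}{suffix}"
    target.write_text(content, encoding="utf-8")
    return target


if __name__ == "__main__":
    # A temporary directory stands in for the mounted output folder.
    with tempfile.TemporaryDirectory() as folder:
        path = save_document(
            Path(folder), "https://example.com/article-1", "<p>placeholder</p>"
        )
        print(path.name)
```

Writing one file per document keeps the generator independent of the consumer: the consuming app only needs to watch the folder for new or changed files.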

The entire setup can be started with a single docker compose up, after which the built-in Pathway HTTP API serves statistics and inference.
