This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 16,234 1,611 Updated May 11, 2025

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

Python 1,161 67 Updated May 20, 2025

Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain

Python 271 44 Updated Oct 18, 2023

Benchmark for Brain Computer Interface methods

Python 16 7 Updated Feb 1, 2025

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)

576 32 Updated Feb 26, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 49,893 7,203 Updated Apr 20, 2025

Code/data for MARG (multi-agent review generation)

Python 43 5 Updated Nov 14, 2024

DSIR large-scale data selection framework for language model training

Python 249 19 Updated Apr 7, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 7,369 723 Updated May 21, 2025

Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)

Python 11 Updated Aug 24, 2024

Turn expensive prompts into cheap fine-tuned models

TypeScript 2,593 143 Updated May 25, 2024

[EMNLP'23, ACL'24] Speeds up LLM inference and improves the LLM's perception of key information by compressing the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

Python 5,102 294 Updated Mar 11, 2025

Minimal Python library to connect to LLMs (OpenAI, Anthropic, Google, Groq, Reka, Together, AI21, Cohere, Aleph Alpha, HuggingfaceHub), with a built-in model performance benchmark.

Python 770 52 Updated May 22, 2025

Minimal Python library to connect to LLMs (OpenAI, Anthropic, AI21, Cohere, Aleph Alpha, HuggingfaceHub, Google PaLM2), with a built-in model performance benchmark.

Python 1 Updated Oct 1, 2023

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 41,799 5,969 Updated May 23, 2025

A Python library for OpenAlex (openalex.org)

Python 237 30 Updated Apr 7, 2025

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

227 33 Updated Jan 24, 2025

Examples and guides for using the OpenAI API

MDX 64,221 10,508 Updated May 21, 2025

Data and tools for generating and inspecting OLMo pre-training data.

Python 1,219 142 Updated May 22, 2025

Pretraining Efficiently on S2ORC!

Python 164 5 Updated Oct 23, 2024

Aligned, Review-Informed Edits of Scientific Papers

Python 52 1 Updated Jul 5, 2023

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 107,981 17,578 Updated May 22, 2025

🦙 Integrating LLMs into structured NLP pipelines

Python 1,251 96 Updated Jan 8, 2025

Open source codebase powering the HuggingChat app

TypeScript 8,732 1,316 Updated May 21, 2025

⚡ Automating scientific workflows with AI ⚡

Python 385 38 Updated Aug 15, 2024

Awesome-LLM: a curated list of Large Language Models

23,452 1,962 Updated May 9, 2025

OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA

Python 302 34 Updated Jun 13, 2023
Python 90 8 Updated May 14, 2024