This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…

Jupyter Notebook 11,217 1,134 Updated Jan 16, 2025

Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy

Python 982 48 Updated Jan 16, 2025

Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain

Python 255 41 Updated Oct 18, 2023

Benchmark for Brain Computer Interface methods

Python 14 7 Updated Jan 3, 2025

A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)

523 28 Updated Dec 12, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 38,306 5,016 Updated Jan 22, 2025

Code/data for MARG (multi-agent review generation)

Python 38 3 Updated Nov 14, 2024

DSIR large-scale data selection framework for language model training

Python 242 19 Updated Apr 7, 2024

High accuracy RAG for answering questions from scientific documents with citations

Python 6,803 672 Updated Jan 18, 2025

Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)

Python 11 Updated Aug 24, 2024

Turn expensive prompts into cheap fine-tuned models

TypeScript 2,527 134 Updated May 25, 2024

[EMNLP'23, ACL'24] Compresses the prompt and KV-Cache to speed up LLM inference and enhance LLMs' perception of key information, achieving up to 20x compression with minimal performance loss.

Python 4,820 268 Updated Dec 16, 2024

Minimal Python library to connect to LLMs (OpenAI, Anthropic, Google, Groq, Reka, Together, AI21, Cohere, Aleph Alpha, HuggingfaceHub), with a built-in model performance benchmark.

Python 732 46 Updated Dec 28, 2024

Minimal Python library to connect to LLMs (OpenAI, Anthropic, AI21, Cohere, Aleph Alpha, HuggingfaceHub, Google PaLM2), with a built-in model performance benchmark.

Python 1 Updated Oct 1, 2023

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 38,235 5,475 Updated Jan 22, 2025

A Python library for OpenAlex (openalex.org)

Python 190 28 Updated Jan 20, 2025

Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.

209 31 Updated Jan 15, 2025

Examples and guides for using the OpenAI API

MDX 61,295 9,824 Updated Jan 22, 2025

Data and tools for generating and inspecting OLMo pre-training data.

Python 1,062 117 Updated Jan 15, 2025

Pretraining Efficiently on S2ORC!

Python 149 4 Updated Oct 23, 2024

Aligned, Review-Informed Edits of Scientific Papers

Python 49 1 Updated Jul 5, 2023

🦜🔗 Build context-aware reasoning applications

Jupyter Notebook 98,752 16,058 Updated Jan 22, 2025

🦙 Integrating LLMs into structured NLP pipelines

Python 1,171 92 Updated Jan 8, 2025

Open source codebase powering the HuggingChat app

TypeScript 7,924 1,171 Updated Jan 22, 2025

⚡ Automating scientific workflows with AI ⚡

Python 378 38 Updated Aug 15, 2024

Awesome-LLM: a curated list of Large Language Models

20,877 1,706 Updated Jan 13, 2025

OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA

Python 301 34 Updated Jun 13, 2023
Python 84 8 Updated May 14, 2024