Stars
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
Fast lexical search implementing BM25 in Python using Numpy, Numba and Scipy
Large medical text dataset curated for abbreviation disambiguation, designed for natural language understanding pre-training in the medical domain
Benchmark for Brain Computer Interface methods
A Comprehensive Survey of Scientific Large Language Models and Their Applications in Scientific Discovery (EMNLP'24)
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Code/data for MARG (multi-agent review generation)
DSIR large-scale data selection framework for language model training
High accuracy RAG for answering questions from scientific documents with citations
Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding (Findings of EMNLP'23)
Turn expensive prompts into cheap fine-tuned models
[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.
Minimal Python library to connect to LLMs (OpenAI, Anthropic, Google, Groq, Reka, Together, AI21, Cohere, Aleph Alpha, HuggingfaceHub), with a built-in model performance benchmark.
soldni / pyllms
Forked from kagisearch/pyllms. Minimal Python library to connect to LLMs (OpenAI, Anthropic, AI21, Cohere, Aleph Alpha, HuggingfaceHub, Google PaLM2), with a built-in model performance benchmark.
LlamaIndex is the leading framework for building LLM-powered agents over your data.
Public space for the user community of Semantic Scholar APIs to share scripts, report issues, and make suggestions.
Examples and guides for using the OpenAI API
Data and tools for generating and inspecting OLMo pre-training data.
🦜🔗 Build context-aware reasoning applications
🦙 Integrating LLMs into structured NLP pipelines
Open source codebase powering the HuggingChat app
⚡ Automating scientific workflows with AI ⚡
Awesome-LLM: a curated list of Large Language Models
OpenAlpaca: A Fully Open-Source Instruction-Following Model Based On OpenLLaMA