Stars
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Fully open reproduction of DeepSeek-R1
Enforce the output format (JSON Schema, regex, etc.) of a language model
A collection of architectural patterns leveraging Large Language Models (LLMs) for efficient Text-to-SQL generation.
An open-source OCR API that leverages OpenAI's powerful language models with optimized performance techniques like parallel processing and batching to deliver high-quality text extraction from comp…
A throughput-oriented high-performance serving framework for LLMs
Pretrain, finetune and serve LLMs on Intel platforms with Ray
Curated list of datasets and tools for post-training.
Official inference repo for FLUX.1 models
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
llama3.np is a pure NumPy implementation of the Llama 3 model.
Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
Llama 3 implementation, one matrix multiplication at a time
A massively parallel, high-level programming language
Implementation of popular ML algorithms from scratch
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
Evaluate the accuracy of LLM generated outputs
Convert any URL to an LLM-friendly input with a simple prefix: https://r.jina.ai/
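
The prefix approach in the last entry can be sketched in a few lines of Python. This is a minimal illustration, not code from that repository: the helper name `reader_url` is hypothetical, and the snippet only builds the prefixed address without fetching it.

```python
# Prefix quoted in the description above.
READER_PREFIX = "https://r.jina.ai/"


def reader_url(target_url: str) -> str:
    """Return target_url with the reader prefix prepended (hypothetical helper)."""
    if target_url.startswith(READER_PREFIX):
        # Already prefixed; avoid adding the prefix twice.
        return target_url
    return READER_PREFIX + target_url


if __name__ == "__main__":
    print(reader_url("https://example.com/article"))
```

Requesting the resulting address with any HTTP client should then return the LLM-friendly version of the page, assuming the service is reachable.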