Stars
KAG is a logical form-guided reasoning and retrieval framework based on the OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
Medical o1, Towards medical complex reasoning with LLMs
Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
A collection of original, innovative ideas and algorithms towards Advanced Literate Machinery. This project is maintained by the OCR Team in the Language Technology Lab, Tongyi Lab, Alibaba Group.
NVIDIA Ingest is an early access set of microservices for parsing hundreds of thousands of complex, messy unstructured PDFs and other enterprise documents into metadata and text to embed into retri…
Sky-T1: Train your own O1 preview model within $450
Search-o1: Agentic Search-Enhanced Large Reasoning Models
A comprehensive library for implementing LLMs, including a unified training pipeline and comprehensive model evaluation.
Modeling, training, eval, and inference code for OLMo
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception
A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Train a 1B LLM on 1T tokens from scratch as an individual.
Multilingual large voice generation model with full-stack inference, training, and deployment capabilities.
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception
A programming framework for agentic AI 🤖 | PyPI: autogen-agentchat | Discord: https://aka.ms/autogen-discord | Office Hour: https://aka.ms/autogen-officehour
Build resilient language agents as graphs.
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
Empowering RAG with a memory-based data interface for all-purpose applications!
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Use PEFT or full-parameter training to fine-tune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…