Stars
A Gradio web UI for Large Language Models with support for multiple inference backends.
Agent S: an open agentic framework that uses computers like a human
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
This repository offers a comprehensive collection of tutorials and implementations for Prompt Engineering techniques, ranging from fundamental concepts to advanced strategies. It serves as an essen…
Machine Learning Engineering Open Book
A modular graph-based Retrieval-Augmented Generation (RAG) system
A playbook for systematically maximizing the performance of deep learning models.
Fast and memory-efficient exact attention
Practical GPU Sharing Without Memory Size Constraints
A high-throughput and memory-efficient inference and serving engine for LLMs
Build high-quality LLM apps - from prototyping, testing to production deployment and monitoring.
GLM-4 series: Open Multilingual Multimodal Chat LMs
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Get up and running with Llama 3.3, Phi 4, Gemma 2, and other large language models.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!
Read and extract text and other content from PDFs in C# (port of PDFBox)
File upload vulnerability scanner and exploitation tool.
Integrate cutting-edge LLM technology quickly and easily into your apps
A WebRTC interface wrapped in the Dart language.