Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,780 494 Updated Feb 24, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 39,146 5,863 Updated Feb 24, 2025

fishaudio / fish-speech

SOTA Open Source TTS

Python 19,473 1,506 Updated Feb 18, 2025

facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Jupyter Notebook 11,346 1,113 Updated Nov 14, 2024

Open-LLM-VTuber / Open-LLM-VTuber

Talk to any LLM with hands-free voice interaction, voice interruption, and a Live2D talking face, running locally across platforms

Python 2,368 237 Updated Feb 24, 2025

kyutai-labs / moshi

Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec.

Python 7,539 604 Updated Feb 24, 2025

ABexit / ASR-LLM-TTS

This is a speech interaction system built on open-source models, integrating ASR, LLM, and TTS in sequence. The ASR model is SenseVoice, the LLM models are Qwen2.5-0.5B/1.5B, and there are three …

Python 427 84 Updated Jan 12, 2025

livekit / livekit

End-to-end stack for WebRTC. SFU media server and SDKs.

Go 11,678 1,010 Updated Feb 24, 2025

FreeIPCC / FreeIPCC

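Call center, intelligent outbound calling, LLM inbound-call robot, LLM outbound-call robot, customer service system, ticketing system, open-source call center system, telephony system, intelligent outbound calling system, intelligent telephone outbound calling, call center system, LLM customer service, telephone outbound calling, customer service center, online customer service, LLM call center, inbound-call robot, LLM robot, intelligent telephone outbound calling, open-source call center system, telephone outbound calling, online customer service, LLM callcenter, contactcenter, Call, IPCC, Customer Service, V…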

26 4 Updated Feb 24, 2025

CompVis / stable-diffusion

A latent text-to-image diffusion model

Jupyter Notebook 69,685 10,337 Updated Jun 18, 2024

OpenBMB / MiniCPM-o

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,638 1,332 Updated Feb 21, 2025

datawhalechina / self-llm

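"A Practical Guide to Open-Source LLMs": a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying open-source LLMs and multimodal LLMs (MLLMs), from China and abroad, in a Linux environment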

Jupyter Notebook 13,006 1,498 Updated Feb 19, 2025

DS4SD / docling

Get your documents ready for gen AI

Python 22,003 1,251 Updated Feb 24, 2025

SoccerNet / sn-spotting

Repository containing all the code needed to get started on the SoccerNet Action Spotting challenge. This repository also contains several benchmark methods.

Python 69 9 Updated Feb 7, 2024

liguodongiot / llm-action

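This project aims to share the technical principles behind large models and hands-on experience with them (LLM engineering and deploying LLM applications in production)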

HTML 14,506 1,677 Updated Feb 12, 2025

milvus-io / milvus

Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search

Go 32,661 3,039 Updated Feb 24, 2025

hiroi-sora / Umi-OCR

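Free, open-source, offline OCR software. Supports screenshots and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in multilingual libraries.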

Python 29,749 2,970 Updated Feb 9, 2025

smile-wingbow / pudding-robot

C++ 80 10 Updated Jan 21, 2025

langfuse / langfuse

🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with LlamaIndex, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 8,784 801 Updated Feb 24, 2025

run-llama / llama_index

LlamaIndex is the leading framework for building LLM-powered agents over your data.

Python 39,245 5,594 Updated Feb 24, 2025

langchain-ai / rag-from-scratch

Jupyter Notebook 3,479 1,036 Updated Jul 9, 2024

HKUDS / LightRAG

"LightRAG: Simple and Fast Retrieval-Augmented Generation"

Python 12,137 1,691 Updated Feb 24, 2025

FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 11,021 1,076 Updated Feb 16, 2025

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,683 216 Updated Dec 5, 2024

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking. Features real-time end-to-end speech input and streaming audio output for conversation.

Python 3,168 275 Updated Nov 5, 2024

sky-walker

Starred repositories

unslothai / unsloth

QwenLM / Qwen2.5-VL

microsoft / OmniParser

stepfun-ai / Step-Audio

ollama / ollama

modelscope / ms-swift