Starred repositories


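🎬 Kaka Subtitle Assistant | VideoCaptioner - An LLM-based intelligent subtitle assistant covering the full workflow of video subtitle generation, sentence segmentation, correction, and subtitle translation. A powerful tool for easy and efficient video subtitling.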

Python 4,701 385 Updated Feb 17, 2025

Noise reduction in Python using spectral gating (speech, bioacoustics, audio, time-domain signals)

Jupyter Notebook 1,570 241 Updated Dec 28, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,727 506 Updated Feb 20, 2025

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 257 9 Updated Feb 23, 2025

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (…

Python 4,391 619 Updated Feb 21, 2025

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook 5,148 328 Updated Oct 18, 2023

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,062 492 Updated Feb 17, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 16,828 1,278 Updated Feb 23, 2025

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 1,390 117 Updated Feb 21, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 39,021 5,844 Updated Feb 24, 2025

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Python 5,694 507 Updated Jul 18, 2024

Make any LLM think like OpenAI o1 and DeepSeek R1

Python 411 22 Updated Feb 6, 2025

AI powered speech denoising and enhancement

Python 1,650 182 Updated Dec 3, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,276 166 Updated Feb 14, 2025
Python 3,204 242 Updated Feb 21, 2025

A fast multimodal LLM for real-time voice

Python 3,619 253 Updated Feb 14, 2025

🔉 👦 👧 Voice-based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)

Python 210 67 Updated Jul 6, 2023

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

C 4,708 962 Updated Jan 31, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 11,318 720 Updated Feb 23, 2025

FireRedASR is a family of open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outs…

Python 588 39 Updated Feb 17, 2025

Everything you need to build state-of-the-art foundation models, end-to-end.

Python 7,372 529 Updated Feb 24, 2025

The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

Python 2,849 188 Updated Feb 24, 2025

Inference and training library for high-quality TTS models.

Python 5,037 526 Updated Dec 10, 2024

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 1,263 123 Updated Jul 15, 2024

Gradio WebUI for audio processing, powered by Whisper (OpenAI-Whisper, Faster-Whisper, Whisper-Timestamped). Features Voice Changer(RVC), zero-shot Voice Cloning (E2, F5-TTS, CosyVoice), YouTube do…

Python 3,334 249 Updated Feb 24, 2025

Witness the aha moment of VLM with less than $3.

Python 2,813 219 Updated Feb 21, 2025

Align Anything: Training All-modality Model with Feedback

Python 2,245 320 Updated Feb 19, 2025

Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.5 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.0 - Superior Quality! 🔌 OpenAI-Compatible. 🌊 Strea…

Python 1,489 324 Updated Feb 23, 2025

A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.

Rust 4,177 327 Updated Feb 4, 2025

FastAPI service on top of WhisperX

Python 68 16 Updated Jan 31, 2025