cczw2010

awen cczw2010

Once you step into front-end development, it is as deep as the sea

21 followers · 73 following

http://www.cnblogs.com/cczw/

Starred repositories

WEIFENG2333 / VideoCaptioner

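🎬 Kaka Subtitle Assistant (卡卡字幕助手) | VideoCaptioner - An LLM-based intelligent subtitle assistant - End-to-end processing for video subtitle generation, sentence segmentation, correction, and subtitle translation! - An LLM-powered tool for easy and efficient video subtitling.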

Python 4,701 385 Updated Feb 17, 2025

timsainb / noisereduce

Noise reduction in python using spectral gating (speech, bioacoustics, audio, time-domain signals)

Jupyter Notebook 1,570 241 Updated Dec 28, 2024

open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,727 506 Updated Feb 20, 2025

Ola-Omni / Ola

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 257 9 Updated Feb 23, 2025

NexaAI / nexa-sdk

Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (…

Python 4,391 619 Updated Feb 21, 2025

snakers4 / silero-models

Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

Jupyter Notebook 5,148 328 Updated Oct 18, 2023

snakers4 / silero-vad

Silero VAD: pre-trained enterprise-grade Voice Activity Detector

Python 5,062 492 Updated Feb 17, 2025

microsoft / OmniParser

A simple screen parsing tool towards a pure-vision-based GUI agent

Jupyter Notebook 16,828 1,278 Updated Feb 23, 2025

SkyworkAI / SkyReels-V1

SkyReels V1: The first and most advanced open-source human-centric video foundation model

Python 1,390 117 Updated Feb 21, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 39,021 5,844 Updated Feb 24, 2025

baichuan-inc / Baichuan-7B

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Python 5,694 507 Updated Jul 18, 2024

harishsg993010 / LLM-Reasoner

Make any LLM think like OpenAI o1 and DeepSeek R1

Python 411 22 Updated Feb 6, 2025

resemble-ai / resemble-enhance

AI powered speech denoising and enhancement

Python 1,650 182 Updated Dec 3, 2024

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 2,276 166 Updated Feb 14, 2025

stepfun-ai / Step-Audio

Python 3,204 242 Updated Feb 21, 2025

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

Python 3,619 253 Updated Feb 14, 2025

SuperKogito / Voice-based-gender-recognition

🔉 👦 👧 Voice-based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)

Python 210 67 Updated Jul 6, 2023

espeak-ng / espeak-ng

eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

C 4,708 962 Updated Jan 31, 2025

kvcache-ai / ktransformers

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 11,318 720 Updated Feb 23, 2025

FireRedTeam / FireRedASR

FireRedASR is a family of open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outs…

Python 588 39 Updated Feb 17, 2025