Qoboty

16 followers · 69 following

Starred repositories

google-research / timesfm

TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.

Python 4,179 357 Updated Jan 11, 2025

notedit / TransRouter

Trans Router

Python 149 25 Updated Jan 12, 2025

hkchengrex / MMAudio

[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 939 101 Updated Jan 9, 2025

hkust-nlp / ceval

Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Python 1,665 79 Updated Oct 26, 2023

VITA-MLLM / VITA

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 1,840 128 Updated Jan 12, 2025

seastar105 / pflow-encodec

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

Python 70 7 Updated May 12, 2024

modelscope / ClearerVoice-Studio

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 1,976 140 Updated Jan 13, 2025

huggingface / smollm

Everything about the SmolLM & SmolLM2 family of models

Python 1,542 81 Updated Jan 7, 2025

Lightricks / LTX-Video

Official repository for LTX-Video

Python 2,513 204 Updated Jan 3, 2025

allenai / open-instruct

Python 2,278 264 Updated Jan 10, 2025

KellerJordan / modded-nanogpt

NanoGPT (124M) in 3.4 minutes

Python 2,056 202 Updated Jan 6, 2025

tincans-ai / gazelle-inference

proof of concept conversation orchestrator with a speech-language model

Go 16 1 Updated Oct 19, 2024

tincans-ai / gazelle

Joint speech-language model - respond directly to audio!

Python 364 34 Updated Jul 1, 2024

0nutation / USLM

Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models" (ICLR 2024)

Python 140 10 Updated Sep 14, 2023

jishengpeng / WavChat

A Survey of Spoken Dialogue Models (60 pages)

247 15 Updated Nov 28, 2024

thunlp / duplex-model

TypeScript 32 4 Updated Aug 17, 2024

sanowl / LSLM-Listening-while-Speaking-Language-Model

LSLM implements full duplex modeling in interactive speech language models, based on research by Ma et al. (2024). This project advances human-computer interaction through real-time spoken dialogue…

Python 56 6 Updated Dec 22, 2024

VideoVerses / VideoTuna

Let's finetune video generation models!

Python 354 14 Updated Dec 22, 2024

haoheliu / SemantiCodec-inference

Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.

Python 180 11 Updated Aug 25, 2024

edwko / OuteTTS

Interface for OuteTTS models.

Python 808 65 Updated Jan 8, 2025

Standard-Intelligence / hertz-dev

first base model for full-duplex conversational audio

Python 1,667 110 Updated Jan 5, 2025

mct10 / RepCodec

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 164 11 Updated Jul 12, 2024

facebookresearch / AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Python 453 21 Updated Oct 28, 2024

freddyaboulton / gradio-webrtc

Realtime Video and Audio Streaming with WebRTC and Gradio

Python 176 23 Updated Jan 10, 2025

linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,170 163 Updated Dec 6, 2024

huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Python 1,381 137 Updated Jan 13, 2025

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,551 206 Updated Dec 5, 2024

cs20s030 / ehmam

Python 9 1 Updated Dec 25, 2024

usefulsensors / moonshine

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,487 127 Updated Jan 9, 2025

facebookresearch / lingua

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,387 227 Updated Jan 10, 2025

Starred topics

word-error-rate