Starred repositories

TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.

Python 4,179 357 Updated Jan 11, 2025

Trans Router

Python 149 25 Updated Jan 12, 2025

[arXiv 2024] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 939 101 Updated Jan 9, 2025

Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]

Python 1,665 79 Updated Oct 26, 2023

✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 1,840 128 Updated Jan 12, 2025

Implementation of TTS model based on NVIDIA P-Flow TTS Paper

Python 70 7 Updated May 12, 2024

An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

Python 1,976 140 Updated Jan 13, 2025

Everything about the SmolLM & SmolLM2 family of models

Python 1,542 81 Updated Jan 7, 2025

Official repository for LTX-Video

Python 2,513 204 Updated Jan 3, 2025

Python 2,278 264 Updated Jan 10, 2025

NanoGPT (124M) in 3.4 minutes

Python 2,056 202 Updated Jan 6, 2025

proof of concept conversation orchestrator with a speech-language model

Go 16 1 Updated Oct 19, 2024

Joint speech-language model - respond directly to audio!

Python 364 34 Updated Jul 1, 2024

Unified Speech Language Model for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"(ICLR 2024)

Python 140 10 Updated Sep 14, 2023

A Survey of Spoken Dialogue Models (60 pages)

247 15 Updated Nov 28, 2024

TypeScript 32 4 Updated Aug 17, 2024

LSLM implements full duplex modeling in interactive speech language models, based on research by Ma et al. (2024). This project advances human-computer interaction through real-time spoken dialogue…

Python 56 6 Updated Dec 22, 2024

Let's finetune video generation models!

Python 354 14 Updated Dec 22, 2024

Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.

Python 180 11 Updated Aug 25, 2024

Interface for OuteTTS models.

Python 808 65 Updated Jan 8, 2025

first base model for full-duplex conversational audio

Python 1,667 110 Updated Jan 5, 2025

Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization

Python 164 11 Updated Jul 12, 2024

An Open-source Streaming High-fidelity Neural Audio Codec

Python 453 21 Updated Oct 28, 2024

Realtime Video and Audio Streaming with WebRTC and Gradio

Python 176 23 Updated Jan 10, 2025

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Python 2,170 163 Updated Dec 6, 2024

Minimalistic large language model 3D-parallelism training

Python 1,381 137 Updated Jan 13, 2025

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,551 206 Updated Dec 5, 2024

Python 9 1 Updated Dec 25, 2024

Fast and accurate automatic speech recognition (ASR) for edge devices

Python 2,487 127 Updated Jan 9, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,387 227 Updated Jan 10, 2025