- Vancouver, Canada
Stars
Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context
A python script that allows your terminal to snow.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
PII Masker is an open-source tool for protecting sensitive data by automatically detecting and masking PII using advanced AI, powered by DeBERTa-v3. It provides high-precision detection, scalable p…
Windows' "Activate Windows" watermark for Linux
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Omni SenseVoice: High-Speed Speech Recognition with word timestamps 🗣️🎯
Multilingual Voice Understanding Model
Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Multilingual large voice generation model with full-stack support for inference, training, and deployment.
[ICLR 2025] SOTA discrete acoustic codec models with 40/75 tokens per second for audio language modeling
Efficient Triton Kernels for LLM Training
Evaluate your LLM's response with Prometheus and GPT4 💯
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
Awesome speech/audio LLMs, representation learning, and codec models
UI Library for Design Engineers. Animated components and effects you can copy and paste into your apps. Free. Open Source.
Minimalist developer portfolio using Next.js 14, React, TailwindCSS, Shadcn UI and Magic UI
Official implementation of the paper "AutoScraper: A Progressive Understanding Web Agent for Web Scraper Generation" [EMNLP 2024]
aider is AI pair programming in your terminal