- I2R
- Singapore
- UTC +08:00
- binwang.xyz
- https://orcid.org/0000-0001-9760-8343
Highlights
- Pro
Stars
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
Robust Speech Recognition via Large-Scale Weak Supervision
My attempt at reproducing the paper Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection
[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia
SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling
PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Translation models for 22 scheduled languages of India
LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models
MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.
A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Multilingual Voice Understanding Model
Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
AudioBench: A Universal Benchmark for Audio Large Language Models
Awesome speech/audio LLMs, representation learning, and codec models
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Evaluate your LLM's response with Prometheus and GPT4 💯
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
ACL 2024 Workshop: CRAFT: Extracting and Tuning Cultural Instructions from the Wild