BinWang28

Organizations

@USC-MCL @emnlp-2023 @SeaEval @NLGPerson


Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 834 133 Updated Jul 18, 2024

Robust Speech Recognition via Large-Scale Weak Supervision

Python 68,953 8,115 Updated Sep 30, 2024

My attempt at reproducing the paper Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection

Jupyter Notebook 392 106 Updated Dec 24, 2022

[ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia

JavaScript 143 14 Updated Jul 30, 2024
Python 6,041 451 Updated Oct 4, 2024

Llama3.1 learns to Listen

Python 155 5 Updated Oct 7, 2024

SOTA discrete acoustic codec models with 40 tokens per second for audio language modeling

Python 686 39 Updated Sep 21, 2024

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Python 175 10 Updated Oct 2, 2024

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,450 304 Updated Jan 4, 2024

Audio captioning recipe

Python 41 4 Updated Jun 22, 2024

Translation models for 22 scheduled languages of India

Python 220 60 Updated Aug 28, 2024

LLaST: Improved End-to-end Speech Translation System Leveraged by Large Language Models

Python 12 1 Updated Aug 11, 2024

MuChoMusic is a benchmark for evaluating music understanding in multimodal audio-language models.

Jupyter Notebook 22 1 Updated Aug 9, 2024

A simple yet powerful tool to turn traditional container/OS images into unprivileged sandboxes.

Shell 626 94 Updated Aug 8, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,126 68 Updated Aug 13, 2024

Multilingual Voice Understanding Model

Python 2,839 268 Updated Sep 25, 2024

The official Meta Llama 3 GitHub site

Python 26,504 2,995 Updated Aug 12, 2024

Metrics for evaluating Automated Audio Captioning systems, designed for PyTorch.

Python 33 3 Updated Jun 28, 2024

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

476 33 Updated Sep 6, 2024

Audio Large Language Models

72 3 Updated Oct 4, 2024

AudioBench: A Universal Benchmark for Audio Large Language Models

Python 76 1 Updated Sep 17, 2024

Awesome speech/audio LLMs, representation learning, and codec models

622 28 Updated Sep 24, 2024

Moonshot - A simple and modular tool to evaluate and red-team any LLM application.

Python 161 33 Updated Oct 7, 2024

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,785 108 Updated Jul 29, 2024

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 8,680 545 Updated Oct 2, 2024

Evaluate your LLM's response with Prometheus and GPT4 💯

Python 766 47 Updated Sep 9, 2024

Container plugin for Slurm Workload Manager

C 282 31 Updated Jul 31, 2024

Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.

Python 1,975 138 Updated Oct 3, 2024

Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)

Python 31,946 3,917 Updated Oct 7, 2024

ACL 2024 Workshop: CRAFT: Extracting and Tuning Cultural Instructions from the Wild

Python 2 Updated Aug 3, 2024