yuta0306

Yubo yuta0306

16 followers · 10 following

Stars

inokoj / VAP-Realtime

A real-time implementation of Voice Activity Projection (VAP), aimed at controlling behaviors of spoken dialogue systems such as turn-taking.

Python 47 6 Updated Jan 6, 2025

huggingface / candle

Minimalist ML framework for Rust

Rust 16,356 1,005 Updated Jan 22, 2025

Gadersd / whisper-burn

A Rust implementation of OpenAI's Whisper model using the burn framework

Rust 285 36 Updated May 6, 2024

ErikEkstedt / VoiceActivityProjection

Voice Activity Projection Models: Self-supervised learning of Turn-taking Events

Python 45 13 Updated May 29, 2024

serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Python 17,234 2,397 Updated Jan 20, 2025

alibaba-damo-academy / SpokenNLP

A wide variety of research projects developed by the SpokenNLP team of Speech Lab, Alibaba Group.

Python 112 11 Updated Dec 20, 2024

tauri-apps / tauri

Build smaller, faster, and more secure desktop and mobile applications with a web frontend.

Rust 88,668 2,701 Updated Jan 23, 2025

huggingface / open_asr_leaderboard

Python 63 22 Updated Nov 29, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,943 2,641 Updated Jan 23, 2025

facebookresearch / av_hubert

A self-supervised learning framework for audio-visual speech

Python 865 138 Updated Dec 7, 2023

declare-lab / MELD

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation

Python 853 209 Updated Mar 10, 2024

VikParuchuri / marker

Convert PDF to markdown + JSON quickly with high accuracy

Python 19,492 1,163 Updated Jan 22, 2025

DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

Python 2,892 265 Updated Jun 4, 2024

Yuan-ManX / ai-audio-datasets

AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

601 45 Updated Jan 15, 2025

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

13,629 872 Updated Jan 17, 2025

arielnlee / Platypus

Code for fine-tuning Platypus fam LLMs using LoRA

Python 627 60 Updated Feb 4, 2024

llm-jp / awesome-japanese-llm

Overview of Japanese LLMs

TypeScript 1,076 31 Updated Jan 22, 2025

lucidrains / audiolm-pytorch

Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch

Python 2,484 271 Updated Jan 12, 2025

JaidedAI / EasyOCR

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic.

Python 25,277 3,224 Updated Sep 24, 2024

iejMac / video2dataset

Easily create large video datasets from video URLs

Python 561 67 Updated Jul 30, 2024

ditoec / openface2_ros

Forked from interaction-lab/openface_ros

ROS bindings for OpenFace 2.1.0

C++ 7 7 Updated Aug 21, 2019

SYSTRAN / faster-whisper

chenfei-wu / TaskMatrix

rickgroen / cov-weighting

google-research / tuning_playbook

pola-rs / polars

huggingface / diffusion-models-class

zcgzcgzcg1 / ACL2022_KnowledgeNLP_Tutorial

j0306043 / jcourse-proceeding

juanmc2005 / diart