cahya-wirawan

💭

building a bunch of Transformer-based Indonesian language models

Cahya Wirawan cahya-wirawan

System engineer, currently working on NLP, CV and Speech Recognition for fun and curiosity

318 followers · 47 following

Vienna, Austria
https://www.linkedin.com/in/cahyawirawan/
@CahyaWr

Achievements

x2 x2

Lists (3)

GAN

Machine Learning

3 repositories

Speech Synthesis

Stars

ALucek / linear-adapter-embedding

Query Only Linear Adapter Training for Fine Tuned Embedding Model Query Representation

Jupyter Notebook 14 2 Updated Sep 12, 2024

UKPLab / sentence-transformers

State-of-the-Art Text Embeddings

Python 16,147 2,560 Updated Mar 5, 2025

ML-GSAI / LLaDA

Official PyTorch implementation for "Large Language Diffusion Models"

Python 965 50 Updated Mar 6, 2025

Zyphra / Zonos

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 5,879 603 Updated Mar 5, 2025

huggingface / agents-course

This repository contains the Hugging Face Agents Course.

Jupyter Notebook 13,812 834 Updated Mar 5, 2025

jina-ai / node-DeepResearch

Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)

TypeScript 3,186 295 Updated Mar 6, 2025

SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 373 28 Updated Feb 3, 2025

QwenLM / Qwen2.5-VL

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,415 591 Updated Mar 4, 2025

multimodal-art-projection / YuE

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 4,257 454 Updated Mar 1, 2025

deepseek-ai / Janus

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,602 2,175 Updated Feb 1, 2025

malaysia-ai / StyleTTS2-MS

Forked from yl4579/StyleTTS2

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 2 Updated Feb 5, 2025

zhenye234 / X-Codec-2.0

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 215 23 Updated Mar 6, 2025

deepseek-ai / DeepSeek-R1

85,217 10,990 Updated Feb 24, 2025

bytedance / LatentSync

Taming Stable Diffusion for Lip Sync!

Python 2,819 419 Updated Jan 19, 2025

s-sakti / data_indsp_news_tts

Lex 5 2 Updated Sep 17, 2022

fakerybakery / simpletts

A lightweight Python library for running TTS models with a unified API.

Python 17 1 Updated Feb 18, 2025

theodorblackbird / lina-speech

Official implementation of the TTS model Lina-Speech

Jupyter Notebook 157 12 Updated Jan 9, 2025

ronantakizawa / cacheaugmentedgeneration

A demo of Cache-Augmented Generation (CAG) in an LLM

Jupyter Notebook 44 7 Updated Jan 1, 2025

OpenBMB / MiniCPM-o

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,847 1,351 Updated Mar 3, 2025

sunnynexus / Search-o1

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Python 687 76 Updated Mar 4, 2025

hhhuang / CAG

Cache-Augmented Generation: A Simple, Efficient Alternative to RAG

Python 1,066 157 Updated Feb 16, 2025

fixie-ai / ultravox

A fast multimodal LLM for real-time voice

Python 3,680 265 Updated Feb 14, 2025

AnswerDotAI / ModernBERT

Bringing BERT into modernity via both architecture changes and scaling

Python 1,255 93 Updated Feb 20, 2025

THUDM / GLM-4-Voice

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,721 222 Updated Dec 5, 2024

microsoft / TRELLIS

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".

Python 8,187 635 Updated Dec 27, 2024

deepseek-ai / DeepSeek-VL2

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,407 1,646 Updated Feb 26, 2025

Lollipop / Qwen2-Audio

Forked from QwenLM/Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 31 3 Updated Sep 8, 2024