cahya-wirawan
building a bunch of transformers based indonesian language models

Query Only Linear Adapter Training for Fine Tuned Embedding Model Query Representation

Jupyter Notebook 14 2 Updated Sep 12, 2024

State-of-the-Art Text Embeddings

Python 16,147 2,560 Updated Mar 5, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 962 50 Updated Mar 6, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 5,879 601 Updated Mar 5, 2025

This repository contains the Hugging Face Agents Course.

Jupyter Notebook 13,801 830 Updated Mar 5, 2025

Keep searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)

TypeScript 3,184 295 Updated Mar 6, 2025

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 372 28 Updated Feb 3, 2025

Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Jupyter Notebook 8,413 590 Updated Mar 4, 2025

YuE: Open Full-song Music Generation Foundation Model, something similar to Suno.ai but open

Python 4,255 454 Updated Mar 1, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,601 2,175 Updated Feb 1, 2025

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 2 Updated Feb 5, 2025

Codec for paper: LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis

Python 215 23 Updated Mar 6, 2025

Taming Stable Diffusion for Lip Sync!

Python 2,818 419 Updated Jan 19, 2025

A lightweight Python library for running TTS models with a unified API.

Python 17 1 Updated Feb 18, 2025

Official implementation of the TTS model Lina-Speech

Jupyter Notebook 157 12 Updated Jan 9, 2025

A demo of Cache-Augmented Generation (CAG) in an LLM

Jupyter Notebook 44 7 Updated Jan 1, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 18,846 1,351 Updated Mar 3, 2025

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Python 687 76 Updated Mar 4, 2025

Cache-Augmented Generation: A Simple, Efficient Alternative to RAG

Python 1,065 157 Updated Feb 16, 2025

A fast multimodal LLM for real-time voice

Python 3,680 265 Updated Feb 14, 2025

Bringing BERT into modernity via both architecture changes and scaling

Python 1,254 93 Updated Feb 20, 2025

GLM-4-Voice | End-to-end Chinese-English spoken dialogue model

Python 2,721 222 Updated Dec 5, 2024

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".

Python 8,186 635 Updated Dec 27, 2024

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,407 1,645 Updated Feb 26, 2025

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 31 3 Updated Sep 8, 2024

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,558 123 Updated Aug 13, 2024

Speech, Language, Audio, Music Processing with Large Language Model

Python 742 69 Updated Mar 6, 2025

Python tool for converting files and office documents to Markdown.

Python 39,569 1,840 Updated Mar 6, 2025