Starred repositories

DeepEP: an efficient expert-parallel communication library

Cuda 6,677 528 Updated Feb 28, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,465 390 Updated Feb 28, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,565 82 Updated Feb 22, 2025

minimal-cost for training 0.5B R1-Zero

Python 565 74 Updated Feb 26, 2025

Keras implementation of generative diffusion models

Python 268 27 Updated Feb 14, 2025

Train your grpo with zero dataset and low resources, 8bit/4bit/lora/qlora supported, multi-gpu supported ...

Python 54 6 Updated Feb 26, 2025

Dense Dilated Convolutions Merging Network for Semantic Segmentation

Python 16 3 Updated Mar 6, 2020

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code

Python 4,405 729 Updated May 2, 2023

PyTorch implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

Python 235 30 Updated Feb 3, 2022

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,910 504 Updated Mar 1, 2025

Fully open data curation for reasoning models

Python 1,400 120 Updated Feb 23, 2025

The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.

Python 2,974 195 Updated Mar 1, 2025

[arXiv 2024] Generalizable Humanoid Manipulation with 3D Diffusion Policies. Part 1: Train & Deploy of iDP3

Python 218 20 Updated Feb 19, 2025

[RSS 2024] 3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations

Python 702 70 Updated Feb 28, 2025

Fully open reproduction of DeepSeek-R1

Python 21,805 1,932 Updated Mar 1, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 10,808 1,386 Updated Feb 1, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 3,986 364 Updated Mar 1, 2025
Python 2,262 157 Updated Feb 24, 2025

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,464 2,164 Updated Feb 1, 2025

Unofficial implementation of Titans, SOTA memory for transformers, in Pytorch

Python 1,145 95 Updated Feb 12, 2025

Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"

Python 295 27 Updated Nov 19, 2024

Stanford NLP Python library for Representation Finetuning (ReFT)

Python 1,429 124 Updated Feb 6, 2025

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 1,132 72 Updated Jul 14, 2024
Python 130 10 Updated Feb 28, 2025
Python 377 9 Updated Dec 5, 2024

Unofficial implementation of "Simplifying, Stabilizing & Scaling Continuous-Time Consistency Models" for MNIST

Python 30 3 Updated Feb 28, 2025
Python 184 17 Updated Apr 18, 2024

An Open Large Reasoning Model for Real-World Solutions

Python 1,462 77 Updated Nov 28, 2024