The University of Hong Kong
Pokfulam, Hong Kong, PRC
(UTC +08:00) - https://www.zhihu.com/people/wang-jia-hao-hku
Starred repositories
Official Implementation of Rectified Flow (ICLR 2023 Spotlight)
Official implementation for the paper "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models"
Official repository of "Visual-RFT: Visual Reinforcement Fine-Tuning"
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Large World Model -- Modeling Text and Video with Millions Context
MM-EUREKA: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning
Paper List of Inference/Test Time Scaling/Computing
[ICLR 2025] Autoregressive Video Generation without Vector Quantization
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
DeepEP: an efficient expert-parallel communication library
A very simple GRPO implementation for reproducing R1-like LLM thinking.
A repository organizing papers, code, and other resources related to unified multimodal models.
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
Fully open reproduction of DeepSeek-R1
[ICLR 2025] Reconstructive Visual Instruction Tuning
Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis
Model Compression Toolbox for Large Language Models and Diffusion Models
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference for large language models.
Code repo for the paper "SpinQuant: LLM Quantization with Learned Rotations"
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient