- HKUST
- HK, China (UTC +08:00)
- https://zhaowei-wang-nlp.github.io/
- @ZhaoweiWang4
- in/zhaowei-wang-571943221
Starred repositories
Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning"
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Code for paper "Super-CLEVR: A Virtual Benchmark to Diagnose Domain Robustness in Visual Reasoning"
🤗 smolagents: a barebones library for agents. Agents write python code to call tools and orchestrate other agents.
Witness the aha moment of VLM with less than $3.
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
A collection of LogitsProcessors to customize and enhance LLM behavior for specific tasks.
Human preference data for "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback"
Entropy Based Sampling and Parallel CoT Decoding
Open Thoughts: Fully Open Data Curation for Thinking Models
Fully open reproduction of DeepSeek-R1
Code for NAACL 2021 full paper "Efficient Attentions for Long Document Summarization"
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external memory/knowledge augmented MLLM.
Multi-LexSum is an abstractive summarization dataset for US Civil Rights Lawsuits
The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"
LOFT: A 1 Million+ Token Long-Context Benchmark
Adaptable tools to make reinforcement learning and evolutionary computation algorithms.
SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images (AAAI2023)
[ARXIV'25] GameFactory: Creating New Games with Generative Interactive Videos
Development kit for the data of the Places365-Standard and Places365-Challenge
Large Concept Models: Language modeling in a sentence representation space
Sky-T1: Train your own O1 preview model within $450