Organizations

@gsoc-cn @msra-alumni @MLSysOps

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 12,978 1,567 Updated Jan 28, 2025

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 4,775 467 Updated Jan 27, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 714 29 Updated Sep 21, 2024

Make websites accessible for AI agents

Python 22,031 2,143 Updated Jan 31, 2025

A PyTorch native library for large model training

Python 3,222 258 Updated Jan 31, 2025

Neural Networks: Zero to Hero

Jupyter Notebook 13,065 1,796 Updated Aug 18, 2024

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,330 453 Updated Jan 28, 2025

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 5,772 526 Updated Dec 14, 2024

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

Python 930 100 Updated Jan 2, 2025

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.

Python 8,144 494 Updated May 3, 2024

Efficient Triton Kernels for LLM Training

Python 4,261 249 Updated Jan 30, 2025

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,270 1,086 Updated Jan 31, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 1,901 190 Updated Jan 31, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 8,645 1,517 Updated Jan 30, 2025

RAG that intelligently adapts to your use case, data, and queries

Python 2,810 140 Updated Jan 22, 2025

Making Long-Context LLM Inference 10x Faster and 10x Cheaper

Python 408 40 Updated Jan 30, 2025

Large Language Model Text Generation Inference

Python 9,670 1,133 Updated Jan 31, 2025

Robust Speech Recognition via Large-Scale Weak Supervision

Python 75,348 9,009 Updated Jan 4, 2025

A fully open-source alternative to NotebookLM

3 Updated Oct 17, 2024

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 18,347 1,922 Updated Oct 15, 2024

MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering

Python 602 70 Updated Jan 14, 2025

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

Python 1,362 57 Updated Jan 31, 2025

An open-source RAG-based tool for chatting with your documents.

Python 20,675 1,604 Updated Jan 27, 2025

A 3DGS framework for omni urban scene reconstruction and simulation.

Python 697 68 Updated Sep 6, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 8,239 800 Updated Jan 31, 2025

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 21,997 2,180 Updated Jan 31, 2025

Set of tools to assess and improve LLM security.

Python 2,862 473 Updated Jan 29, 2025

The Memory layer for AI Agents

Python 24,261 2,251 Updated Jan 31, 2025