Hannibal046

🎯

Focusing

Truth and Freedom | 真实与自由

378 followers · 137 following

Stars

hemingkx / Spec-Bench

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Python 209 22 Updated Oct 25, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,045 1,043 Updated Dec 26, 2024

deepseek-ai / DeepSeek-V3

Python 11,552 788 Updated Dec 31, 2024

IBM / recurrent-chunked-models-regular-languages

Code for "Recurrent Transformers Trade-off Parallelism for Length Generalization on Regular Languages"

Python 7 1 Updated Nov 13, 2024

foundation-model-stack / bamba

Train, tune, and run inference with the Bamba model

Python 67 11 Updated Dec 20, 2024

jerber / lang-jepa

Python 89 4 Updated Dec 23, 2024

feifeibear / LLMSpeculativeSampling

Fast inference from large language models via speculative decoding

Python 615 63 Updated Aug 22, 2024

huggingface / picotron_tutorial

Python 33 6 Updated Dec 20, 2024

AnswerDotAI / ModernBERT

Bringing BERT into modernity via both architecture changes and scaling

Python 838 44 Updated Dec 21, 2024

huggingface / picotron

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 274 18 Updated Dec 20, 2024

facebookresearch / memory

Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, sparsely activated memory layers complement compute-heavy dense f…

Python 79 4 Updated Dec 12, 2024

facebookresearch / blt

Code for BLT research paper

Python 1,202 83 Updated Dec 12, 2024

NLPJCL / RAG-Retrieval

Unify Efficient Fine-tuning of RAG Retrieval, including Embedding, ColBERT, ReRanker.

Python 592 49 Updated Dec 29, 2024

bentrevett / clip-search

Text-to-image search with OpenCLIP, Docker, Flask, Faiss, etc. and a basic front-end.

Python 4 Updated Apr 27, 2024

NVlabs / GatedDeltaNet

Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule

Python 57 6 Updated Dec 31, 2024

Infini-AI-Lab / TriForce

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 240 14 Updated Aug 31, 2024

allenai / awesome-open-source-lms

Friends of OLMo and their links.

221 14 Updated Dec 15, 2024

Infini-AI-Lab / Sequoia

Scalable and robust tree-based speculative decoding algorithm

Python 324 37 Updated Aug 13, 2024

hemingkx / SpeculativeDecodingPapers

📰 Must-read papers and blogs on Speculative Decoding ⚡️

533 25 Updated Dec 30, 2024

FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,362 164 Updated Jun 25, 2024

sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.

Python 6,937 635 Updated Dec 31, 2024

waterhorse1 / Natural-language-RL

Natural Language Reinforcement Learning

Python 64 7 Updated Dec 19, 2024

Caiyun-AI / DCFormer

Python 187 15 Updated Dec 22, 2024

SafeAILab / EAGLE

Official Implementation of EAGLE-1 (ICML'24) and EAGLE-2 (EMNLP'24)

Python 887 91 Updated Dec 30, 2024

gaogaotiantian / viztracer

A debugging and profiling tool that can trace and visualize python code execution

Python 5,611 407 Updated Dec 5, 2024

facebookresearch / xformers

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 8,848 633 Updated Dec 31, 2024

Zhen-Tan-dmml / LLM4Annotation

305 15 Updated Dec 26, 2024

mlflow / mlflow

Open source platform for the machine learning lifecycle

Python 19,120 4,299 Updated Dec 31, 2024

smart-lty / ParallelSpeculativeDecoding

The official code for the paper "Parallel Speculative Decoding with Adaptive Draft Length."

Python 30 1 Updated Aug 23, 2024

NVlabs / hymba

Python 137 10 Updated Dec 11, 2024