- King Abdullah University of Science and Technology, University of Macau, UfishAI
- Jeddah (UTC +08:00)
- https://www.notion.so/shuyhere/Shu-Yang-1210f14e46e080f18511e448279487e6?pvs=4
- @shuyhere
Stars
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
The hub for EleutherAI's work on interpretability and learning dynamics
MIXQ: Taming Dynamic Outliers in Mixed-Precision Quantization by Online Prediction
Mixed-precision inference with TensorRT-LLM
Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering
Tools for understanding how transformer predictions are built layer-by-layer
SGLang is a fast serving framework for large language models and vision language models.
An open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output for conversation.
Video Question Answering via Gradually Refined Attention over Appearance and Motion
Large language models to generate stable crystals.
🔥 Omni large models and datasets for understanding and generating multi-modalities.
Open source replication of Anthropic's Crosscoders for Model Diffing
The AI developer platform. Use Weights & Biases to train and fine-tune models, and manage models from experimentation to production.
Everything about the SmolLM & SmolLM2 family of models
Toolkit for attaching, training, saving and loading of new heads for transformer models
Training SAEs for your LLM, and visualize it in one place
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
A modern, highly customizable, responsive Jekyll theme for documentation with built-in search.
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
A curated list of LLM interpretability material: tutorials, libraries, surveys, papers, blogs, etc.
A collection of work on reverse engineering large models.