Latent Alignment and Variational Attention

Python 327 60 Updated Nov 5, 2018

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 4,607 365 Updated Mar 1, 2025

R1-onevision, a visual language model capable of deep CoT reasoning.

266 5 Updated Feb 28, 2025

SlamKit is an open source tool kit for efficient training of SpeechLMs. It was used for "Slamming: Training a Speech Language Model on One GPU in a Day"

Python 135 5 Updated Feb 27, 2025

Tools for merging pretrained large language models.

Python 6 1 Updated Feb 5, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,429 383 Updated Feb 28, 2025

FlashMLA: Efficient MLA Decoding Kernel for Hopper GPUs

C++ 10,762 704 Updated Feb 27, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 894 42 Updated Feb 28, 2025

Muon optimizer: +>30% sample efficiency with <3% wallclock overhead

Python 428 23 Updated Feb 26, 2025

Clearbox AI's all-in-one solution for generation and evaluation of synthetic tabular and time-series data.

Jupyter Notebook 40 1 Updated Feb 27, 2025

Synthetic Data Engine 💎

Python 44 Updated Feb 28, 2025

Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper

Python 488 18 Updated Feb 28, 2025

Examples and guides for using the Gemini API

Jupyter Notebook 10,816 1,305 Updated Feb 27, 2025

Implementation of DeepCrossAttention, proposed by Heddes et al. at Google Research, in PyTorch

Python 76 4 Updated Feb 24, 2025

A Qwen 0.5B reasoning model trained on OpenR1-Math-220k

Jupyter Notebook 10 Updated Feb 26, 2025

A very simple GRPO implementation for reproducing R1-like LLM thinking.

Python 606 42 Updated Feb 28, 2025

[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)

Python 494 52 Updated Feb 29, 2024

Fastest kernels written from scratch

Cuda 184 24 Updated Feb 15, 2025

Official PyTorch implementation for "Large Language Diffusion Models"

Python 749 27 Updated Feb 26, 2025

BOM, STL files and instructions for PAROL6 3D printed robot arm

HTML 1,599 170 Updated Feb 2, 2025

PyTorch implementation of Zonos.

Python 5 Updated Feb 12, 2025

Free, open source crypto trading bot

Python 36,810 7,206 Updated Feb 28, 2025

💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩

1,000 54 Updated Feb 10, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 11,952 777 Updated Mar 1, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,721 502 Updated Feb 27, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 5,671 570 Updated Feb 18, 2025

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 946 67 Updated Feb 28, 2025

Reverse Engineering: Decompiling Binary Code with Large Language Models

Python 5,161 348 Updated Oct 28, 2024

s1: Simple test-time scaling

Python 5,765 655 Updated Feb 23, 2025

Pretraining code for a large-scale depth-recurrent language model

Python 649 53 Updated Feb 14, 2025