Stars
A high-performance distributed file system designed to address the challenges of AI training and inference workloads.
R1-onevision, a visual language model capable of deep CoT reasoning.
SlamKit is an open-source toolkit for efficient training of speech language models (SpeechLMs). It was used for "Slamming: Training a Speech Language Model on One GPU in a Day".
AlpinDale / mergekit-LGPL
Forked from arcee-ai/mergekit. Tools for merging pretrained large language models.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashMLA: Efficient MLA Decoding Kernel for Hopper GPUs
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
Muon optimizer: >30% sample-efficiency gain with <3% wall-clock overhead (a sketch of the update rule appears after this list).
Clearbox AI's all-in-one solution for generation and evaluation of synthetic tabular and time-series data.
Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
Examples and guides for using the Gemini API
Implementation of the proposed DeepCrossAttention by Heddes et al at Google research, in Pytorch
A Qwen 0.5B reasoning model trained on OpenR1-Math-220k
A very simple GRPO implementation for reproducing R1-like LLM reasoning (a sketch of the group-relative advantage computation appears after this list).
[ICML 2024 Best Paper] Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution (https://arxiv.org/abs/2310.16834)
Official PyTorch implementation for "Large Language Diffusion Models"
BOM, STL files and instructions for PAROL6 3D printed robot arm
💬 An extensive collection of exceptional resources dedicated to the captivating world of talking face synthesis! ⭐ If you find this repo useful, please give it a star! 🤩
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.
Reverse Engineering: Decompiling Binary Code with Large Language Models
Pretraining code for a large-scale depth-recurrent language model
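For context on the Muon optimizer entry above: Muon keeps SGD-style momentum for each 2-D weight matrix and orthogonalizes the update with a few Newton-Schulz iterations before applying it. The sketch below is a rough illustration based on public descriptions of the method; the function names, constants, and structure are assumptions, not code from the starred repository.

```python
# Minimal sketch of a Muon-style update for a single 2-D weight matrix.
# Coefficients and structure are assumptions drawn from public descriptions
# of Muon, not taken from the repository listed above.
import torch

def newton_schulz_orthogonalize(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately map G to a semi-orthogonal matrix via a quintic iteration."""
    a, b, c = 3.4445, -4.7750, 2.0315          # iteration coefficients (assumed)
    X = G / (G.norm() + 1e-7)                  # normalize so the iteration converges
    if X.shape[0] > X.shape[1]:                # work with the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    if G.shape[0] > G.shape[1]:                # restore the original orientation
        X = X.T
    return X

def muon_step(weight, grad, momentum_buf, lr=0.02, momentum=0.95):
    """One Muon-style step: accumulate momentum, then apply the orthogonalized update."""
    momentum_buf.mul_(momentum).add_(grad)
    update = newton_schulz_orthogonalize(momentum_buf)
    weight.add_(update, alpha=-lr)

# Toy usage on a random 2-D weight.
w = torch.randn(64, 128)
buf = torch.zeros_like(w)
g = torch.randn_like(w)
muon_step(w, g, buf)
```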
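And for the GRPO entry: GRPO scores a group of sampled completions for the same prompt and uses each completion's reward, normalized against the rest of its group, as the advantage, avoiding a separate value model. Below is a minimal sketch of that advantage computation; the function and variable names are illustrative, not the repo's actual code.

```python
# Minimal sketch of GRPO's group-relative advantage computation.
# Names are illustrative; this is not the starred repo's actual code.
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar rewards for sampled completions.

    Each completion's advantage is its reward standardized against the other
    completions sampled for the same prompt.
    """
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled completions each.
rewards = torch.tensor([[1.0, 0.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0, 1.0]])
print(group_relative_advantages(rewards))
```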