eqy
💭 damn that's crazy


Staging ground for release notes for PyTorch

2 9 Updated Apr 18, 2025
1 Updated Feb 21, 2025

Stack trace visualizer

Perl 4 1 Updated Sep 2, 2020

NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 146 21 Updated Apr 18, 2025

Implementation for MatMul-free LM.

Python 2,986 186 Updated Nov 5, 2024

Time zone database and code

C 1,624 229 Updated Apr 1, 2025
OCaml 5 1 Updated Apr 17, 2025

World's Smallest Nintendo Wii, using a trimmed motherboard and custom stacked PCBs

C 718 10 Updated Mar 31, 2025
Python 441 30 Updated Apr 6, 2025

GTP engine and self-play learning in Go

C++ 3,908 599 Updated Apr 10, 2025

graph generation and analysis stuff

1 Updated Feb 26, 2024

Zero Bubble Pipeline Parallelism

Python 383 22 Updated Apr 7, 2025

A Pin

2 Updated Aug 1, 2023

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 11,989 1,213 Updated Apr 17, 2025
TeX 1 Updated Aug 27, 2024

You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 28,593 3,300 Updated Apr 19, 2025

The official implementation of “Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training”

Python 957 54 Updated Jan 30, 2024
TypeScript 2 Updated May 17, 2023

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in bot…

Cuda 2 Updated Sep 29, 2022

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…

Python 488 70 Updated Apr 17, 2025

Composable + Tunable = Optimal

Python 2 Updated Apr 14, 2023

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

C 9,004 319 Updated Mar 29, 2025

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 317 56 Updated Apr 19, 2025

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,305 569 Updated Oct 28, 2024

CUDA Kernel Benchmarking Library

Cuda 620 74 Updated Apr 18, 2025

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,523 912 Updated Apr 7, 2025

An open-source efficient deep learning framework/compiler, written in Python.

Python 698 59 Updated Feb 25, 2025