Organizations

@brownsys

DeepSpeed FastGen

9 3 Updated Jan 29, 2025

ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)

Python 34 3 Updated Jan 31, 2025
Jupyter Notebook 47 9 Updated Nov 24, 2024

Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines

Python 196 12 Updated May 6, 2024

Machine Learning Engineering Open Book

Python 12,579 770 Updated Jan 31, 2025

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,106 1,708 Updated Jan 29, 2025

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Python 1,951 176 Updated Jan 24, 2025

Pretrained language model with 100B parameters

Python 3,748 298 Updated Jul 10, 2023

Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2

Python 2,883 562 Updated Dec 4, 2024

Code release for SLIP: Self-supervision meets Language-Image Pre-training

Python 754 71 Updated Feb 9, 2023

Azure HPC/AI VM Images

Shell 100 80 Updated Jan 28, 2025

Library for 8-bit optimizers and quantization routines.

717 38 Updated Aug 18, 2022

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Python 1,358 220 Updated Mar 20, 2024

Distribution-transparent Machine Learning experiments on Apache Spark

Python 90 14 Updated Feb 21, 2024

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,062 1,039 Updated Jan 29, 2025

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed

Python 437 74 Updated Jun 14, 2023

Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.

Python 472 91 Updated Oct 23, 2024

RDMA and SHARP plugins for the NCCL library

C 173 34 Updated Jan 22, 2025

Example models using DeepSpeed

Python 6,242 1,062 Updated Jan 30, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,442 4,218 Updated Jan 31, 2025

A minimal & modern LaTeX template for your (bachelor's | master's | doctoral) thesis

TeX 1,180 132 Updated Nov 16, 2023

Find the smallest number of switches necessary to build topologies of a given number of hosts and bisection bandwidth for the EGFT, HyperX, and Jellyfish topologies.

Python 2 Updated Jul 24, 2013