(MSc Thesis) Investigating the application of gradient compression techniques used to speed up distributed deep learning

Python 4 Updated Aug 19, 2020

SIDCo: An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems

Python 9 3 Updated Jun 6, 2021

Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '23)

Python 14 3 Updated Sep 21, 2023

GRACE - GRAdient ComprEssion for distributed deep learning

Python 135 44 Updated Jul 23, 2024

Practical low-rank gradient compression for distributed optimization: https://arxiv.org/abs/1905.13727

Python 142 32 Updated Sep 3, 2024

Managed collective communication service

Rust 10 1 Updated Sep 2, 2024

A distributed deep learning platform

C++ 3,355 1,242 Updated Sep 17, 2024

Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for LLMs

Python 69 5 Updated Jun 14, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 250 9 Updated Oct 10, 2024

Outline Server, developed by Jigsaw. The Outline Server is a proxy server that runs a Shadowsocks instance and provides a REST API for access key management.

TypeScript 5,795 782 Updated Oct 9, 2024

[CVPR 2023] DepGraph: Towards Any Structural Pruning

Python 2,647 329 Updated Oct 7, 2024

Efficient RPCs for datacenter networks

C++ 852 138 Updated May 9, 2024

AIFM: High-Performance, Application-Integrated Far Memory

C 105 35 Updated Feb 28, 2023
C 132 29 Updated Mar 25, 2021

Next-generation datacenter OS built on kernel bypass to speed up unmodified code while improving platform density and security

C++ 66 7 Updated Jul 31, 2024

A web framework for building multi-user virtual reality experiences.

JavaScript 1,160 296 Updated Jul 23, 2024

[ICLR 2018] Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training

Python 209 45 Updated Jul 10, 2024

Data Plane Development Kit

C 3,326 1,228 Updated Oct 10, 2024

Data Plane Development Kit

C 10 11 Updated Oct 10, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 29,068 3,333 Updated Oct 11, 2024

Ongoing research training transformer models at scale

Python 10,229 2,298 Updated Oct 11, 2024

PyTorch extensions for high performance and large scale training.

Python 3,168 279 Updated Aug 30, 2024
Python 12 3 Updated Jul 7, 2024

RedLeaf Operating System

Rust 117 8 Updated May 9, 2022

A code repository for the PyTorch C++ (LibTorch) tutorial.

C++ 732 121 Updated Nov 2, 2021

Example of tch-rs on M1

Rust 47 6 Updated Mar 19, 2024

Rust bindings for the C++ api of PyTorch.

Rust 4,249 335 Updated Oct 4, 2024

Switch ML Application

C++ 169 48 Updated Jul 15, 2022