AriesLL

Aries AriesLL

Tenure-Track Assistant Professor at University of Pittsburgh, ECE; PhD'19 UCLA

20 followers · 8 following

Stars

SEAL-UCSB / NVSim

NVSim - A performance, energy and area estimation tool for non-volatile memory (NVM)

C++ 109 51 Updated Aug 27, 2018

NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT

C++ 6,003 899 Updated Mar 27, 2024

intel / optimization-manual

Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"

Assembly 787 91 Updated May 3, 2024

corsix / amx

Apple AMX Instruction Set

C 1,037 50 Updated Dec 26, 2024

NVlabs / CGBN

CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups

Cuda 206 59 Updated Sep 25, 2024

vkrasnov / vpmadd

Multiplication using AVX512 and AVX512IFMA instructions

Assembly 23 6 Updated Nov 9, 2015

facebookresearch / deepfloat

An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

SystemVerilog 391 40 Updated Mar 11, 2023

intel / hexl

Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

C++ 224 51 Updated Oct 26, 2024

homenc / HElib

HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizati…

C++ 3,173 769 Updated Aug 1, 2024

openfheorg / openfhe-development

This is the development repository for the OpenFHE library. The current (stable) version is v1.2.3 (released on October 30, 2024).

C++ 817 205 Updated Feb 6, 2025

fastmachinelearning / models

Models and examples built with hls4ml

C++ 12 3 Updated Apr 8, 2020

arc-research-lab / CHARM

CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture

C++ 127 22 Updated Jan 13, 2025

databrickslabs / dolly

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

Python 10,806 1,155 Updated Jun 30, 2023

verilog-to-routing / vtr-verilog-to-routing

Verilog to Routing -- Open Source CAD Flow for FPGA Research

C++ 1,049 401 Updated Feb 6, 2025

deepspeedai / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,548 4,225 Updated Feb 6, 2025

yhzhang0128 / egos-2000

Envision a future where every student can read all the code of a teaching operating system.

C 2,239 156 Updated Feb 6, 2025

google / xls

XLS: Accelerated HW Synthesis

C++ 1,233 182 Updated Feb 6, 2025

google / fully-homomorphic-encryption

An FHE compiler for C++

C++ 3,543 259 Updated Sep 4, 2024

razvanmarinescu / EB1A

EB1A Full Application - I-140 and I-485

TeX 264 103 Updated Nov 20, 2023

apple / ml-ane-transformers

Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

Python 2,587 86 Updated Apr 25, 2023

nsf-open / nsf-proposal-latex-samples

LaTeX samples for NSF Research.gov Proposal Submission. For more information about Research.gov Proposal Submission visit https://www.research.gov/research-web/content/aboutpsm Feedback [email protected]

TeX 235 64 Updated Dec 7, 2023

JinmingZhuang / ECE2195

Python 1 Updated Apr 29, 2022

olucas98 / nano_rev

Acceleration of Identity block of NanoReviser neural network

C++ 1 Updated Apr 29, 2022

lshector / ECE2195-Spring2022

Artifacts Repository for ECE2195 Project

Python 1 Updated Apr 27, 2022

danielstumpp / transposed-conv-accel

FPGA-based acceleration of transposed convolution

C++ 4 Updated Apr 29, 2022

siliconcompiler / siliconcompiler

Modular hardware build system

Python 912 88 Updated Feb 6, 2025
