Stars
NVSim - A performance, energy and area estimation tool for non-volatile memory (NVM)
Transformer related optimization, including BERT, GPT
Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"
CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups
Multiplication using AVX512 and AVX512IFMA instructions
An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.
Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption
HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizati…
This is the development repository for the OpenFHE library. The current (stable) version is v1.2.3 (released on October 30, 2024).
CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture
Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Verilog to Routing -- Open Source CAD Flow for FPGA Research
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Envision a future where every student can read all the code of a teaching operating system.
Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)
LaTeX samples for NSF Research.gov Proposal Submission. For more information about Research.gov Proposal Submission, visit https://www.research.gov/research-web/content/aboutpsm
Acceleration of Identity block of NanoReviser neural network
Artifacts Repository for ECE2195 Project
FPGA-based acceleration of transposed convolution