
Organizations

@UCLA-VAST

NVSim - A performance, energy and area estimation tool for non-volatile memory (NVM)

C++ 109 51 Updated Aug 27, 2018

Transformer related optimization, including BERT, GPT

C++ 6,003 899 Updated Mar 27, 2024

Contains the source code examples described in the "Intel® 64 and IA-32 Architectures Optimization Reference Manual"

Assembly 787 91 Updated May 3, 2024

Apple AMX Instruction Set

C 1,037 50 Updated Dec 26, 2024

CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups

Cuda 206 59 Updated Sep 25, 2024

Multiplication using AVX512 and AVX512IFMA instructions

Assembly 23 6 Updated Nov 9, 2015

An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

SystemVerilog 391 40 Updated Mar 11, 2023

Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

C++ 224 51 Updated Oct 26, 2024

HElib is an open-source software library that implements homomorphic encryption. It supports the BGV scheme with bootstrapping and the Approximate Number CKKS scheme. HElib also includes optimizati…

C++ 3,173 769 Updated Aug 1, 2024

This is the development repository for the OpenFHE library. The current (stable) version is v1.2.3 (released on October 30, 2024).

C++ 817 205 Updated Feb 6, 2025

Models and examples built with hls4ml

C++ 12 3 Updated Apr 8, 2020

CHARM: Composing Heterogeneous Accelerators on Heterogeneous SoC Architecture

C++ 127 22 Updated Jan 13, 2025

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform

Python 10,806 1,155 Updated Jun 30, 2023

Verilog to Routing -- Open Source CAD Flow for FPGA Research

C++ 1,049 401 Updated Feb 6, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,548 4,225 Updated Feb 6, 2025

Envision a future where every student can read all the code of a teaching operating system.

C 2,239 156 Updated Feb 6, 2025
C++ 3 Updated Mar 23, 2023

XLS: Accelerated HW Synthesis

C++ 1,233 182 Updated Feb 6, 2025

An FHE compiler for C++

C++ 3,543 259 Updated Sep 4, 2024

EB1A Full Application - I-140 and I-485

TeX 264 103 Updated Nov 20, 2023
C++ 5 Updated May 30, 2023
C++ 8 2 Updated Oct 14, 2019
1 Updated Jun 6, 2022

Reference implementation of the Transformer architecture optimized for Apple Neural Engine (ANE)

Python 2,587 86 Updated Apr 25, 2023

LaTeX samples for NSF Research.gov Proposal Submission. For more information about Research.gov Proposal Submission visit https://www.research.gov/research-web/content/aboutpsm Feedback [email protected]

TeX 235 64 Updated Dec 7, 2023
Python 1 Updated Apr 29, 2022

Acceleration of Identity block of NanoReviser neural network

C++ 1 Updated Apr 29, 2022

Artifacts Repository for ECE2195 Project

Python 1 Updated Apr 27, 2022

FPGA-based acceleration of transposed convolution

C++ 4 Updated Apr 29, 2022

Modular hardware build system

Python 912 88 Updated Feb 6, 2025