Stars
A reading list for homomorphic encryption
A library for lattice-based multiparty homomorphic encryption in Go
Everything you want to know about Google Cloud TPU
A simple and elegant Jekyll theme for an academic personal homepage
INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
Create a mobile Balatro app from your Steam version of Balatro
A TensorFlow Implementation of the Transformer: Attention Is All You Need
fastllm is a high-performance large-model inference library implemented in C++ with a dependency-free backend (only CUDA is required; no PyTorch dependency). It can run inference on the DeepSeek R1 671B INT4 model with a single RTX 4090, reaching 20+ tokens/s per stream.
A high-throughput and memory-efficient inference and serving engine for LLMs
GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.
A GitBook about quadcopters and Crazepony.
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
This is the source code of the 2021 replication for ReScience of the paper "Speedup Graph Processing by Graph Ordering" by Hao Wei, Jeffrey Xu Yu, Can Lu, and Xuemin Lin, published in Proceedings o…
Fast and memory-efficient exact attention
Implementation of a Tensor Processing Unit for embedded systems and the IoT.
An OpenCL-based FPGA Accelerator for Convolutional Neural Networks
A simple Transformer model implemented in C++ (Attention Is All You Need).
[Beginner project] This repository contains HLS code for hardware acceleration of a handwritten-digit-recognition CNN on a Xilinx FPGA.
A separable-convolution module designed with HLS (High-Level Synthesis) for acceleration on an FPGA.
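Several entries above center on the "Attention Is All You Need" Transformer (the TensorFlow implementation, the C++ port, FlashAttention, vLLM). As a minimal sketch of the mechanism they all build on, here is scaled dot-product attention in plain NumPy; the function name and toy shapes are illustrative, not taken from any of the listed repositories:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V, the core Transformer operation."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 tokens, head dimension d_k = 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one d_k-dimensional output per query token
```

Libraries like FlashAttention compute exactly this quantity but tile the `scores` matrix so it never fully materializes in memory, which is where the "memory-efficient exact attention" in the description comes from.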