Xidian University
Beijing, China
Stars
STONNE: A Simulation Tool for Neural Networks Engines
Repository to host and maintain scale-sim-v2 code
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
A tool to modify ONNX models visually, based on Netron and Flask.
[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
The official code of the paper "PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction".
A curated list for Efficient Large Language Models
Ramulator 2.0 is a modern, modular, extensible, and fast cycle-accurate DRAM simulator. It provides support for agile implementation and evaluation of new memory system designs (e.g., new DRAM stan…
Implementations of basic hardware units in RTL (Verilog for now), which can be used for area/power evaluation and to support hardware design tradeoffs.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
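📚 A summary of basic knowledge for C/C++ technical interviews, covering the language, libraries, data structures, algorithms, systems, networking, and linking and loading of libraries, along with interview experience, job postings, and referral information. This repository is a summary of the basic knowledge of recruiting job seekers and beginners in the direction of C/C++ technology, in…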
✨✨Latest Advances on Multimodal Large Language Models
SMAUG: Simulating Machine Learning Applications Using Gem5-Aladdin
This is the top-level repository for the Accel-Sim framework.
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
A Fast and Extensible DRAM Simulator, with built-in support for modeling many different DRAM technologies including DDRx, LPDDRx, GDDRx, WIOx, HBMx, and various academic proposals. Described in the…
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
A collection of IC textbooks for learning.
SpotServe: Serving Generative Large Language Models on Preemptible Instances
VPTQ, a flexible and extreme low-bit quantization algorithm
Open deep learning compiler stack for Kendryte AI accelerators ✨
Open-source high-performance RISC-V processor
A high-throughput and memory-efficient inference and serving engine for LLMs