Stars
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
4-bit quantization of LLaMA using GPTQ
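A minimal sketch of what 4-bit weight quantization means in principle, using plain round-to-nearest per-channel quantization in PyTorch. This is illustrative only and is not GPTQ itself, which additionally minimizes layer-wise reconstruction error using approximate second-order information.

```python
import torch

def quantize_4bit_rtn(w: torch.Tensor):
    """Round-to-nearest 4-bit quantization of a weight matrix, per output channel."""
    # Per-row scale mapping weights into the symmetric int4 range [-8, 7].
    max_abs = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8)
    scale = max_abs / 7.0
    q = torch.clamp(torch.round(w / scale), -8, 7)
    return q.to(torch.int8), scale          # bit-packing omitted for clarity

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4096, 4096)
q, s = quantize_4bit_rtn(w)
print((w - dequantize(q, s)).abs().mean())  # mean quantization error
```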
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
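A minimal LoRA fine-tuning setup with 🤗 PEFT, shown as a sketch: the base checkpoint, rank, and target module names below are placeholders and depend on the architecture being adapted.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; swap in whichever checkpoint you are adapting.
model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # attention projection names differ per architecture
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```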
Code and documentation to train Stanford's Alpaca models, and generate the data.
LLaMA: Open and Efficient Foundation Language Models
Benchmark API for Multidomain Language Modeling
Progressive Prompts: Continual Learning for Language Models
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
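Usage roughly follows the repo's README; the hyperparameters below are illustrative rather than recommended settings.

```python
import torch
from vit_pytorch import ViT

model = ViT(
    image_size=256,
    patch_size=32,
    num_classes=1000,
    dim=1024,
    depth=6,        # number of transformer blocks
    heads=16,
    mlp_dim=2048,
    dropout=0.1,
    emb_dropout=0.1,
)

img = torch.randn(1, 3, 256, 256)
preds = model(img)  # (1, 1000) class logits
```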
Vision Transformer for 3D medical image registration (PyTorch).
To eventually become an unofficial PyTorch implementation / replication of AlphaFold2, as details of the architecture get released
PyTorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale)
Cracking algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
Transformer seq2seq model: a program that can build a language translator from a parallel corpus
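A minimal seq2seq translation skeleton built on torch.nn.Transformer, shown as a sketch rather than the repo's code; vocabulary sizes are placeholders and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

class TranslationModel(nn.Module):
    """Tiny encoder-decoder wrapper around nn.Transformer for parallel-corpus training."""

    def __init__(self, src_vocab=8000, tgt_vocab=8000, d_model=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.transformer = nn.Transformer(d_model=d_model, nhead=8, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Causal mask so the decoder cannot peek at future target tokens.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        h = self.transformer(self.src_emb(src_ids), self.tgt_emb(tgt_ids), tgt_mask=tgt_mask)
        return self.out(h)  # next-token logits over the target vocabulary

model = TranslationModel()
logits = model(torch.randint(0, 8000, (2, 10)), torch.randint(0, 8000, (2, 9)))
print(logits.shape)  # torch.Size([2, 9, 8000])
```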
A small package to create visualizations of PyTorch execution graphs
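Typical usage of the package (assumes torchviz is installed; rendering the graph additionally requires a Graphviz install on the system).

```python
import torch
from torchviz import make_dot

model = torch.nn.Linear(16, 4)
x = torch.randn(1, 16)
y = model(x).sum()

# Render the autograd graph that produced `y`.
dot = make_dot(y, params=dict(model.named_parameters()))
dot.render("linear_graph", format="png")  # writes linear_graph.png
```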
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Python .mu reader/writer and Blender import/export addon
Practice on CIFAR-100 (ResNet, DenseNet, VGG, GoogLeNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, ResNet in ResNet, ResNeXt, ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet…
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
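Usage along the lines of the repo's README; the argument names below follow my reading of it and should be treated as a sketch.

```python
import torch
from mixture_of_experts import MoE

# A sparsely-gated MoE layer: each token is routed to a few of the 16 experts,
# so parameters grow with num_experts while per-token compute stays roughly flat.
moe = MoE(
    dim=512,
    num_experts=16,
    hidden_dim=512 * 4,
)

tokens = torch.randn(4, 1024, 512)
out, aux_loss = moe(tokens)  # output plus a load-balancing auxiliary loss
```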
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
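A minimal DeepSpeed training setup, shown as a sketch: the config values are illustrative, the model is a placeholder, and the script is assumed to be launched with the `deepspeed` launcher.

```python
import deepspeed
import torch

ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 2},  # shard optimizer state and gradients
}

model = torch.nn.Linear(1024, 1024)  # placeholder model

# Wraps the model in a DeepSpeed engine that owns the optimizer and ZeRO sharding.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# One training step: the engine handles gradient accumulation and sharding details.
x = torch.randn(8, 1024).to(model_engine.device)
loss = model_engine(x).pow(2).mean()
model_engine.backward(loss)
model_engine.step()
```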
PyTorch implementation of Mixer-nano (0.67M parameters, vs. 18M for the original Mixer-S/16) with 90.83% accuracy on CIFAR-10, trained from scratch.
Implements MLP-Mixer (https://arxiv.org/abs/2105.01601) with the CIFAR-10 dataset.
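A compact MLP-Mixer block in plain PyTorch, included only to illustrate the token-mixing / channel-mixing structure behind the two Mixer entries above; it is not either repo's exact code, and the dimensions are placeholders.

```python
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    """One MLP-Mixer block: MLP across patches (token mixing), then across channels."""

    def __init__(self, num_patches=64, dim=128, token_hidden=64, channel_hidden=512):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mlp = nn.Sequential(
            nn.Linear(num_patches, token_hidden), nn.GELU(), nn.Linear(token_hidden, num_patches)
        )
        self.norm2 = nn.LayerNorm(dim)
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(), nn.Linear(channel_hidden, dim)
        )

    def forward(self, x):                      # x: (batch, patches, channels)
        # Token mixing: transpose so the MLP runs across the patch dimension.
        x = x + self.token_mlp(self.norm1(x).transpose(1, 2)).transpose(1, 2)
        # Channel mixing: standard MLP across the feature dimension.
        x = x + self.channel_mlp(self.norm2(x))
        return x

x = torch.randn(2, 64, 128)                    # e.g. an image split into 64 patches
print(MixerBlock()(x).shape)                   # torch.Size([2, 64, 128])
```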
Let's train vision transformers (ViT) on CIFAR-10!
Implementation of Memorizing Transformers (ICLR 2022), an attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in PyTorch
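A conceptual single-head sketch of the core idea, not the repo's API: keys and values cached from earlier segments are retrieved per query and attended to alongside the local context. The paper and repo use an approximate nearest-neighbor index over a much larger memory; exact top-k search is used here for clarity.

```python
import torch
import torch.nn.functional as F

def knn_memory_attention(q, local_k, local_v, mem_k, mem_v, top_k=32):
    """Each query attends to local keys plus its top-k retrieved memory slots."""
    # Retrieve the most similar memory slots per query (exact dot-product search
    # standing in for an approximate nearest-neighbor index).
    idx = (q @ mem_k.T).topk(top_k, dim=-1).indices           # (q_len, top_k)
    k_ret, v_ret = mem_k[idx], mem_v[idx]                     # (q_len, top_k, d)

    # Concatenate retrieved memories with the shared local context for each query.
    k_all = torch.cat([local_k.unsqueeze(0).expand(q.size(0), -1, -1), k_ret], dim=1)
    v_all = torch.cat([local_v.unsqueeze(0).expand(q.size(0), -1, -1), v_ret], dim=1)

    attn = F.softmax((q.unsqueeze(1) @ k_all.transpose(1, 2)) / q.size(-1) ** 0.5, dim=-1)
    return (attn @ v_all).squeeze(1)                          # (q_len, d)

d = 64
out = knn_memory_attention(torch.randn(16, d), torch.randn(16, d), torch.randn(16, d),
                           torch.randn(4096, d), torch.randn(4096, d))
print(out.shape)  # torch.Size([16, 64])
```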