Guide: Fine-tune GPT-2 XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Hugging Face Transformers using DeepSpeed
Notebooks for learning deep learning
BELLE: Be Everyone's Large Language Model Engine (an open-source Chinese conversational large language model)
The official PyTorch implementation of Google's Gemma models
OpenAI Triton backend for Intel® GPUs
A throughput-oriented high-performance serving framework for LLMs
FlashInfer: Kernel Library for LLM Serving
Minimalistic large language model 3D-parallelism training
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
Felafax is building AI infra for non-NVIDIA GPUs
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
You like pytorch? You like micrograd? You love tinygrad! ❤️
How to optimize algorithms in CUDA.
Distribute and run LLMs with a single file.
Productive, portable, and performant GPU programming in Python.
ncnn is a high-performance neural network inference framework optimized for the mobile platform
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o (an open-source multimodal dialogue model with performance approaching GPT-4o)
Quick, visual, principled introduction to pytorch code through five colab notebooks.
A library to analyze PyTorch traces.
A Python package that extends official PyTorch to easily obtain extra performance on Intel platforms
Examples of how to use PyTorch's TensorIterator in C++
A PyTorch native library for large model training
🚀🚀 Train a 26M-parameter GPT completely from scratch in just 2 hours!
Accessible large language models via k-bit quantization for PyTorch.
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Open Thoughts: Fully Open Data Curation for Thinking Models