Stars
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
TensorZero creates a feedback loop for optimizing LLM applications — turning production data into smarter, faster, and cheaper models.
Translate PDF, EPub, webpage, metadata, annotations, notes to the target language. Support 20+ translate services.
A Texas hold'em and short-deck solver implemented in Java
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.
Apply deep reinforcement learning methods, including DQN and DDPG, to traffic light control in simulation (discrete environment), to demonstrate the 'Green Wave' phenomenon in intelligent traffic systems.
YangletLiu / DQN_traffic_light_control
Forked from quantumiracle/Reinforcement_Learning_for_Traffic_Light_Control
X.-Y. Liu, Z. Ding, S. Borst, A. Walid. Deep reinforcement learning for intelligent transportation systems. NeurIPS Workshop on Machine Learning for Intelligent Transportation Systems, 2018.
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
"OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction"
Unlock the potential of finetuning Large Language Models (LLMs). Learn from an industry expert, and discover when to apply finetuning, data preparation techniques, and how to effectively train and eva…
AirLLM 70B inference with single 4GB GPU
Open-Sora: Democratizing Efficient Video Production for All
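This repository collects high-quality information sources in the AI technology field. It helps keep information sources in sync, avoiding information gaps and information cocoons.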
A Palworld server API similar to Minecraft's Bukkit; not finished yet
This is an unofficial Palworld server binary distribution project that fixes some problems with the original server.
a curated list of high-quality papers on resource-efficient LLMs 🌱
Bayesian optimisation & Reinforcement Learning library developed by Huawei Noah's Ark Lab
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Interact with your documents using the power of GPT, 100% privately, no data leaks
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (ICLR 2023).
😋 A curated reading list about database systems
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
[SIGIR'24] The official implementation code of MOELoRA.
High-speed Large Language Model Serving for Local Deployment