Stars
A Survey on Data Selection for Language Models
A summary of existing representative text datasets for LLMs.
A quick guide to trending datasets, especially instruction fine-tuning datasets
The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.
The official GitHub page for the survey paper "A Survey of Large Language Models".
Collection of training data management explorations for large language models
A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca)
✨✨Latest Advances on Multimodal Large Language Models
The open-source project for "Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading" [MobiCom 2022]