- King Abdullah University of Science and Technology, University of Macau, UfishAI
- Jeddah
- https://www.notion.so/shuyhere/Shu-Yang-1210f14e46e080f18511e448279487e6?pvs=4
- @shuyhere
Highlights
- Pro
research
A series of large language models trained from scratch by developers @01-ai
Locating and editing factual associations in GPT (NeurIPS 2022)
Influence Analysis and Estimation - Survey, Papers, and Taxonomy
Dive into Deep Learning (《动手学深度学习》): written for Chinese readers, runnable, and open for discussion. The Chinese and English editions are used for teaching at over 500 universities in more than 70 countries.
Codebase for Merging Language Models (ICML 2024)
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
[SIGIR'24] The official implementation code of MOELoRA.
Firefly: a training toolkit for large language models, supporting training of Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other models.
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" (a minimal sketch of the low-rank update follows this list).
Customizable implementation of the self-instruct paper.
Mixture of Experts (MoE) techniques for enhancing LLM performance through expert-driven prompt mapping and adapter combinations.
Firefly Chinese LLaMA-2 large language models, supporting incremental pretraining of Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, Bloom, and other models.
Feeling confused about super alignment? Here is a reading list
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
An Efficient "Factory" to Build Multiple LoRA Adapters
Instruct-tune LLaMA on consumer hardware
Development repository for the Triton language and compiler
2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
Accessible large language models via k-bit quantization for PyTorch.
Benchmark baseline for retrieval QA applications
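The loralib entry above names LoRA (low-rank adaptation). As a reminder of the idea, here is a minimal sketch in plain NumPy rather than loralib's actual API; the dimensions, rank r, and scaling alpha are illustrative assumptions, not values from any repo listed here.

```python
import numpy as np

# Minimal LoRA sketch: instead of updating a frozen weight W (d_out x d_in),
# learn a low-rank update B @ A with A (r x d_in) and B (d_out x r), r << d_in.
d_in, d_out, r, alpha = 1024, 1024, 8, 16  # illustrative shapes and rank

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))       # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01    # trainable factor, small random init
B = np.zeros((d_out, r))                     # trainable factor, zero init: no change at start

def lora_forward(x):
    # Base projection plus the scaled low-rank update (alpha / r) * B @ A @ x.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
y = lora_forward(x)   # equals W @ x until B is trained away from zero
```

Only A and B are trained while W stays frozen, which is what keeps the adapter small to store and easy to merge back into the base weight.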