Insights: hpcaitech/ColossalAI
Overview
19 Pull requests merged by 5 people
-
[upgrade] upgrade gpt2
#6291 merged
May 8, 2025 -
[fix] revert reward update and evaluation
#6295 merged
May 7, 2025 -
[feat] Support evaluation during training
#6290 merged
May 3, 2025 -
[feat] Update reward verification
#6292 merged
May 3, 2025 -
[feat] Sync shard model
#6289 merged
Apr 30, 2025 -
[Feat] support hybrid parallel model sync
#6288 merged
Apr 29, 2025 -
[feat] Support boxed math reward
#6284 merged
Apr 29, 2025 -
[fix] Fix tp & pp; Fix dataloader
#6280 merged
Apr 28, 2025 -
[hotfix] Fix save issue
#6279 merged
Apr 27, 2025 -
[hotfix] fix checkpoint naming; add num_epoch parameter
#6277 merged
Apr 26, 2025 -
[feat] Support DAPO
#6263 merged
Apr 25, 2025 -
Upgrade transformers
#6276 merged
Apr 24, 2025 -
[feat] Add final save at the end
#6274 merged
Apr 23, 2025 -
[feat] Add custom prompt
#6273 merged
Apr 22, 2025 -
[feat] GRPO with distributed implementation
#6230 merged
Apr 21, 2025 -
[ci] update ci
#6254 merged
Apr 18, 2025 -
Update README.md
#6268 merged
Apr 17, 2025 -
[hot-fix] Fix memory leakage bug, support TP+PP
#6258 merged
Apr 10, 2025 -
[Distributed RLHF] Integration of PP
#6257 merged
Apr 9, 2025
7 Pull requests opened by 6 people
-
[upgrade]transformers upgrade
#6275 opened
Apr 23, 2025 -
[DOC]: Update the documentation of ShardConfig for 1D, 2D, 2.5D, 3D tensor parallelism
#6278 opened
Apr 26, 2025 -
[shardformer] Upgrade transformers version: falcon model
#6283 opened
Apr 28, 2025 -
[feat] Manually schedule resources and support auto master address assigning
#6293 opened
May 3, 2025 -
[Ring Attention] Add more detailed references
#6294 opened
May 6, 2025 -
[upgrade]upgrade mistral
#6296 opened
May 7, 2025 -
[hotfix] Add more info
#6297 opened
May 8, 2025
1 Issue closed by 1 person
7 Issues opened by 5 people
-
[FEATURE]: Upgrade the transformers versions of shardformer
#6281 opened
Apr 28, 2025 -
why GeminiPlugin zero3+offloading cannot training a 7B model
#6272 opened
Apr 21, 2025 -
[BUG]: Unable to figure out how to pass env variables for each node
#6261 opened
Apr 14, 2025
6 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[BUG]: can not save model in pipeline training mode
#6253 commented on
Apr 12, 2025 • 0 new comments -
[DOC]: Please provide guidance on using ColossalAI to train custom models
#6238 commented on
Apr 12, 2025 • 0 new comments -
[BUG]: Loss is always NaN when training on 64 or 80 GPUs
#6239 commented on
Apr 13, 2025 • 0 new comments -
TP GPU memory usage under Hybrid Parallel Plugin is higher than DeepSpeed with the same configuration???
#6161 commented on
Apr 13, 2025 • 0 new comments -
[BUG]: bug in using HybridAdam optimizer
#5223 commented on
Apr 30, 2025 • 0 new comments -
[pre-commit.ci] pre-commit autoupdate
#6179 commented on
May 5, 2025 • 0 new comments