Insights: NVIDIA/NeMo
Overview
1 Release published by 1 person
-
v2.2.1 NVIDIA Neural Modules 2.2.1
published
Mar 31, 2025
25 Pull requests merged by 15 people
-
Update base container in `Dockerfile.speech`
#12859 merged
Apr 2, 2025 -
ci: Increase prune time
#12860 merged
Apr 2, 2025 -
[automodel] qlora peft
#12817 merged
Apr 2, 2025 -
Update resiliency example notebook readme and add links to the brev launchable
#12843 merged
Apr 2, 2025 -
[audio] Adding tests for predictive models
#12823 merged
Apr 2, 2025 -
Fix timestamps when cuda graphs enabled
#12808 merged
Apr 2, 2025 -
Fix trt-llm install
#12827 merged
Apr 1, 2025 -
Prune docker images in GHA older than 8hrs
#12838 merged
Apr 1, 2025 -
add missing call to _apply_liger_kernel_to_instance
#12806 merged
Apr 1, 2025 -
[automodel] add cpu:gloo to backend
#12832 merged
Apr 1, 2025 -
TP optimization and code cleanup for llama3 70b LoRA
#12488 merged
Apr 1, 2025 -
add finetune support for Auto Configurator
#12770 merged
Apr 1, 2025 -
transcribe fix for new hypotheses
#12801 merged
Mar 31, 2025 -
Cherry pick `Update changelog for r2.2.1` (12818) into `r2.2.0`
#12820 merged
Mar 31, 2025 -
Update changelog for `r2.2.1`
#12818 merged
Mar 31, 2025 -
Alit/hyena recipes
#12582 merged
Mar 31, 2025 -
ci: Exclude nlp, mm, vision collections
#12816 merged
Mar 31, 2025 -
add __init__.py to make this a package
#12814 merged
Mar 31, 2025 -
ci: Move scripts fully down to files
#12802 merged
Mar 31, 2025 -
ci: Remove `--branch`
#12809 merged
Mar 31, 2025 -
Fix TransformerBlock cuda_graphs compatibility with MCore
#12779 merged
Mar 31, 2025 -
Add BERT/Qwen2.5 Unit test and Refactor all GHA Conversion Tests
#12785 merged
Mar 31, 2025 -
ci: Fix flaky LLM tests
#12807 merged
Mar 30, 2025 -
ci: Measure multiprocessing
#12778 merged
Mar 29, 2025 -
[doc] Fixes for audio doc warnings
#12736 merged
Mar 27, 2025
27 Pull requests opened by 22 people
-
test
#12803 opened
Mar 28, 2025 -
Non-blocking checkpoint cleanup failure
#12804 opened
Mar 28, 2025 -
handle identical duration bins
#12810 opened
Mar 31, 2025 -
ci: Horizontal fail-fast
#12811 opened
Mar 31, 2025 -
Fix vad param confusion
#12812 opened
Mar 31, 2025 -
ci: Bump dependencies
#12819 opened
Mar 31, 2025 -
[automodel] Add FSDPv2-compatible context parallelism support.
#12821 opened
Mar 31, 2025 -
Allow configuring signal for signal handling
#12824 opened
Mar 31, 2025 -
[automodel] Add linear ce loss support
#12825 opened
Mar 31, 2025 -
DeepseekV3 SFT finetuning perf config
#12829 opened
Apr 1, 2025 -
Adds support for SFT and Llama + Qwen converters
#12830 opened
Apr 1, 2025 -
Add energon dataset support for Qwen2VL
#12831 opened
Apr 1, 2025 -
Handle CUDA_DEVICE_MAX_CONNECTIONS before job launch
#12833 opened
Apr 1, 2025 -
[fault tolerance] Add local checkpointing support
#12839 opened
Apr 1, 2025 -
Update LLaVA's next HF exporter to load ViT checkpoint from YAML
#12841 opened
Apr 1, 2025 -
Fine tuning data module restructure for huggingface datasets
#12842 opened
Apr 1, 2025 -
update streaming conformer
#12846 opened
Apr 1, 2025 -
[not ready] Saves checkpoints at specified steps
#12847 opened
Apr 1, 2025 -
[automodel] hsdp
#12851 opened
Apr 2, 2025 -
Fix qwen2.5 1.5b configuration inheritance bug
#12852 opened
Apr 2, 2025 -
Fix pack error when using custom prompt_template
#12854 opened
Apr 2, 2025 -
Remove Unexecuted Hyena Code Paths
#12856 opened
Apr 2, 2025 -
Improve evo2 dataset test and testability
#12857 opened
Apr 2, 2025 -
Guard decord import and update nvidia-resiliency-ext
#12861 opened
Apr 2, 2025 -
Add vocab size as attr to GPT and T5 Configs, use file name based logger in llm.gpt.data
#12862 opened
Apr 2, 2025 -
Implement Speculative transform script for GPT models
#12863 opened
Apr 3, 2025 -
[automodel] fix hellaswag tokenizer call
#12864 opened
Apr 3, 2025
3 Issues closed by 3 people
-
Model creates duplicate transcriptions
#12442 closed
Mar 30, 2025 -
Broken offline mode of NeMo
#11899 closed
Mar 30, 2025
14 Issues opened by 14 people
-
No signal.SIGKILL attribute error under MS Windows 11
#12858 opened
Apr 2, 2025 -
generating manifest file for training titanet-large
#12853 opened
Apr 2, 2025 -
Configuration inheritance bug with Qwen2.5 1.5B
#12849 opened
Apr 2, 2025 -
canary-1b-flash not generalizing
#12845 opened
Apr 1, 2025 -
canary-1b-flash missing punctuation when timestamps are enabled
#12844 opened
Apr 1, 2025 -
Cache Aware Streaming script yields different results for different batch_sizes
#12840 opened
Apr 1, 2025 -
speechllm TP bug
#12837 opened
Apr 1, 2025 -
Bug Report: Failed to Build causal-conv1d During NeMo Installation
#12835 opened
Apr 1, 2025 -
Bug after computing timestamps and it's not supported for canary_model?
#12834 opened
Apr 1, 2025 -
Deepseek-v2 pretrain recipe doesn't work
#12828 opened
Apr 1, 2025 -
Is there any plan for supporting Qwen2.5 VL?
#12813 opened
Mar 31, 2025 -
torch.distributed.DistNetworkError
#12805 opened
Mar 29, 2025 -
Discrepancy in custom transcribe pipeline vs. `model.transcribe()` for QuartzNet model
#12800 opened
Mar 27, 2025 -
BUG - ASR - Finetuned hybrid model timestamps
#12799 opened
Mar 27, 2025
66 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Batched beam search for transducers (RNN-T and TDT)
#12729 commented on
Apr 2, 2025 • 23 new comments -
add nemotron5
#12660 commented on
Apr 2, 2025 • 9 new comments -
WIP: S2S collection
#12617 commented on
Apr 2, 2025 • 7 new comments -
[automodel] add linear loss function
#12772 commented on
Apr 1, 2025 • 7 new comments -
Enable in-fw deployment for eval with OAI compatible server
#12101 commented on
Apr 2, 2025 • 6 new comments -
feat: add support for checkpointing with multistorageclient for Exp M…
#12747 commented on
Apr 2, 2025 • 4 new comments -
Adds Standard transformer flops formula to flops callback
#12774 commented on
Apr 2, 2025 • 3 new comments -
Add PerceiverLayer implementation
#12618 commented on
Mar 31, 2025 • 3 new comments -
TRT-LLM tests
#12671 commented on
Apr 3, 2025 • 2 new comments -
Mingyuanm/GitHub ci flux
#12761 commented on
Apr 2, 2025 • 2 new comments -
Committing changes for nemo-run asr
#12748 commented on
Mar 31, 2025 • 2 new comments -
ONNX and TRT Export Test for LLM Embedding Models
#12692 commented on
Apr 3, 2025 • 1 new comment -
Add Blockwise FP8 to PTQ & EP to modelopt resume
#12670 commented on
Apr 2, 2025 • 1 new comment -
Add unit tests for evaluation
#12705 commented on
Mar 31, 2025 • 1 new comment -
RNNT Timestamps Fix & Timestamps in transcribe_speech_parallel
#12707 commented on
Apr 2, 2025 • 1 new comment -
Add packed sequences for HFDatasetDataModule
#12551 commented on
Mar 31, 2025 • 1 new comment -
Testing deploy_inframework and query_inframework scripts
#12667 commented on
Apr 3, 2025 • 1 new comment -
[Automodel] Add TP/SP support
#12796 commented on
Apr 2, 2025 • 1 new comment -
Fix GPT HF Exporter dtype and head_dim
#12792 commented on
Mar 31, 2025 • 1 new comment -
Fix bugs in `AudioToMelSpectrogramPreprocessor.input_example`
#12063 commented on
Mar 30, 2025 • 1 new comment -
Add safetensor option when saving and restoring models
#11549 commented on
Mar 30, 2025 • 1 new comment -
make a outer forward function for FluxControlnet
#12662 commented on
Apr 2, 2025 • 0 new comments -
sft nemo2.0 checkin
#12794 commented on
Mar 27, 2025 • 0 new comments -
Expose NCCL timeout via existing env var
#12669 commented on
Mar 31, 2025 • 0 new comments -
GB200 LLM performance scripts tuning
#12791 commented on
Apr 2, 2025 • 0 new comments -
Update ssm.py
#12687 commented on
Apr 3, 2025 • 0 new comments -
Update modelopt upperbound to 0.27
#12788 commented on
Apr 3, 2025 • 0 new comments -
Adding more doc-strings to megatron_parallel.py
#12767 commented on
Mar 31, 2025 • 0 new comments -
Bugfix eval test
#12704 commented on
Apr 1, 2025 • 0 new comments -
feat: add support for nemo 2.0 checkpointing with multistorageclient
#12746 commented on
Apr 1, 2025 • 0 new comments -
Adding tests for more coverage
#12713 commented on
Mar 28, 2025 • 0 new comments -
Expand test coverage neva / mllama
#12715 commented on
Apr 2, 2025 • 0 new comments -
Export and Deploy Unit Tests
#12717 commented on
Apr 3, 2025 • 0 new comments -
Test TE-free path for peft
#12733 commented on
Apr 2, 2025 • 0 new comments -
Remove code related to canonical adapters since it's not supported in NeMo 2.0
#12724 commented on
Apr 2, 2025 • 0 new comments -
AED Decoding with N-Gram LM
#12730 commented on
Apr 2, 2025 • 0 new comments -
NeMo is not friendly to HF compatibility.
#12166 commented on
Mar 28, 2025 • 0 new comments -
when i use container to do sft for any model, it has context not found error
#11825 commented on
Mar 30, 2025 • 0 new comments -
russian optimal vocab_size
#12740 commented on
Apr 1, 2025 • 0 new comments -
Unable to find JitConfig & JitTransform in nemo/lightning/pytorch/callbacks
#12710 commented on
Apr 2, 2025 • 0 new comments -
Fix checkpoint loading when lm_head is on separate pipeline stage
#10769 commented on
Apr 1, 2025 • 0 new comments -
Add global state cleanup function
#11172 commented on
Apr 1, 2025 • 0 new comments -
Add nemo1 to nemo2 conversion for neva
#11860 commented on
Mar 30, 2025 • 0 new comments -
fix(huggingface-hub): allow offline mode
#11901 commented on
Mar 30, 2025 • 0 new comments -
Make TETransformerLayerAutocast Support Cuda Graph
#12075 commented on
Mar 31, 2025 • 0 new comments -
Remove getattr_proxy to avoid problematic edge cases
#12176 commented on
Apr 3, 2025 • 0 new comments -
Fix: 'IterableDatasetWrapper' has no len() when using Lhotse datasets
#12190 commented on
Mar 29, 2025 • 0 new comments -
Update L2_NeMo_2_NeMo_Mcore_Mixtral_bitexact to reenable failure on mismatch
#12233 commented on
Mar 30, 2025 • 0 new comments -
change loss return format so that it can work with calculate_per_token_loss
#12459 commented on
Mar 29, 2025 • 0 new comments -
Nemo run ipl
#12470 commented on
Apr 1, 2025 • 0 new comments -
first commit
#12477 commented on
Mar 30, 2025 • 0 new comments -
initialize model with metadata
#12496 commented on
Mar 28, 2025 • 0 new comments -
Remove adapter_path from base AutoResume and refactor PEFT checkpoint handling
#12565 commented on
Apr 2, 2025 • 0 new comments -
[resiliency] Add in process integration for Nemo2
#12589 commented on
Apr 1, 2025 • 0 new comments -
Minimize overhead from asynchronous checkpointing
#12590 commented on
Mar 29, 2025 • 0 new comments -
Heh/speech conv dev
#12598 commented on
Apr 1, 2025 • 0 new comments -
ci: pip-installable automodels examples
#12609 commented on
Mar 30, 2025 • 0 new comments -
Feature/wsd scheduler
#12611 commented on
Apr 2, 2025 • 0 new comments -
Hugging Face model deployment support
#12628 commented on
Apr 3, 2025 • 0 new comments -
Ptq parallel config dataclass
#12632 commented on
Apr 1, 2025 • 0 new comments -
Enable use_sharp argument
#12636 commented on
Apr 1, 2025 • 0 new comments -
Variable global and micro batch sizes for different GPUs
#12640 commented on
Mar 29, 2025 • 0 new comments -
[automodel] expose gradient clip value for automodel recipe
#12648 commented on
Mar 31, 2025 • 0 new comments -
Alit/fw eval nm5 ux
#12658 commented on
Apr 2, 2025 • 0 new comments -
FSDP2 support for MultiTask AED models (Canary)
#12661 commented on
Apr 3, 2025 • 0 new comments