Stars
Training, optimization, and deployment of an object detection model with a DINOv2 backbone for efficient inference on NVIDIA Jetson
Curated resources for LLM research, screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions that help readers get the gist as quickly as possible.
Papers, datasets, and resources related to 2D cartoon video research. Contributions welcome.
Controllable video and image generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
TheDenk / ControledAnimateDiff
Forked from guoyww/AnimateDiff
ControlNet extension of AnimateDiff.
[NeurIPS D&B Track 2024] Official implementation of HumanVid
A working list of recent human video generation methods. This repository focuses on half/full-body human video generation methods; NeRF, Gaussian splatting, motion pose, and talking head/portrait is n…
✨✨Latest Advances on Multimodal Large Language Models
Awesome speech/audio LLMs, representation learning, and codec models
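Deep learning systems notes, covering the mathematical foundations of deep learning, detailed explanations of basic neural network components, deep learning training and tuning strategies, and detailed explanations of model compression algorithms.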
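Documenting the growth path of a CV algorithm engineer and sharing notes on the computer vision and model compression and deployment tech stack. https://harleyszhang.github.io/cv_note/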
A universal Stable-Diffusion toolbox
Character Animation (AnimateAnyone, Face Reenactment)
Open-Sora: Democratizing Efficient Video Production for All
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
PyTorch reimplementation of Diffusion Models
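python | Efficient use of the statistical language model kenlm: new word discovery, word segmentation, intelligent error correction, and more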
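A deep learning summary covering fundamentals, object detection, object tracking, object classification, standard deep learning interview questions, related competitions, and more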
Unofficial Implementation of Animate Anyone
A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, also including new ideas, new problems, new resources, and new projects from work and research
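Deep learning interview handbook (covering mathematics, machine learning, deep learning, computer vision, natural language processing, SLAM, and other areas)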
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Instant voice cloning by MIT and MyShell. Audio foundation model.