Stars
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A suite of image and video neural tokenizers
Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"
Python library & framework to build custom translators for the hearing-impaired and translate between Sign Language & Text using Artificial Intelligence.
🔥 A minimal training framework for scaling FLA models
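A program that searches for and downloads songs from music sites such as Baidu, NetEase, QQ, Kugou, and Migu, with support for downloading lossless music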
Implementations of various linear RNN layers using PyTorch and Triton
Triton implementation of bi-directional (non-causal) linear attention
A tool used by the MLNLP community to help shorten references. It simplifies BibTeX entries with official info
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Official code for "SRFormer: Permuted Self-Attention for Single Image Super-Resolution" (ICCV 2023) and SRFormerV2
Torchvision-like Deformable Convolution with both 1D, 2D, 3D operators, and their transposed versions.
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A PyTorch implementation of the paper "ANSD-MA-MSE: Adaptive Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding"
A vue-based project page template for academic papers. (in development) https://junyaohu.github.io/academic-project-page-template-vue
Official implementation of "MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map" (NeurIPS 2024)