Stars
ComfyUI-EdgeTTS is a powerful text-to-speech node for ComfyUI, leveraging Microsoft's Edge TTS capabilities. It enables seamless conversion of text into natural-sounding speech, supporting multiple…
A Text To Speech node using Step-Audio-TTS in ComfyUI. Can speak, rap, sing, or clone voice.
A cross-platform private song playback service.
Toolkit for linearizing PDFs for LLM datasets/training
A voice conversion extension node for ComfyUI based on FreeVC, enabling high-quality voice conversion capabilities within the ComfyUI framework.
Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…
BELLE: Be Everyone's Large Language model Engine(开源中文对话大模型)
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch
Wheels for llama-cpp-python compiled with cuBLAS support
woct0rdho / triton-windows
Forked from triton-lang/tritonFork of the Triton language and compiler for Windows support
Development repository for the Triton language and compiler
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
A simple screen parsing tool towards pure vision based GUI agent
SkyReels V1: The first and most advanced open-source human-centric video foundation model
The ultimate javascript content-type utility.
Next generation frontend tooling. It's fast!
🎬 卡卡字幕助手 | VideoCaptioner - 基于 LLM 的智能字幕助手 - 视频字幕生成、断句、校正、字幕翻译全流程处理!- A powered tool for easy and efficient video subtitling.
A utility to give some insight into how you use your keyboard
Keyviz is a free and open-source tool to visualize your keystrokes ⌨️ and 🖱️ mouse actions in real-time.
A set of nodes to edit videos using the Hunyuan Video model