Stars
Source code for the paper "Empowering LLM to use Smartphone for Intelligent Task Automation"
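This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and bringing large-model applications to production)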
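OpenAI API management & distribution system supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot, iFLYTEK Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. It can be used to manage and redistribute keys, ships as a single executable with a prebuilt Docker image, and deploys in one click, ready to use out of the box. OpenAI key management & redistributi…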
Python package for combining diarization system outputs.
Clustering-based methods for overlapping diarization
Spot the conversation: speaker diarisation in the wild
A flexible framework powered by ComfyUI for generating personalized Nobel Prize images.
Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format - [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, Replicate, Groq]
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
Java wrapper around the FFmpeg command line tool
[ECCV'24] Kalman-Inspired Feature Propagation for Video Face Super-Resolution
[Pattern Recognition'24] Looking Beyond Input Frames: Self-Supervised Adaptation for Video Super-Resolution
SAVSR: Arbitrary-Scale Video Super-Resolution via A Learned Scale-Adaptive Architecture (AAAI'2024)
[CVPR 2024] Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Solution of the NTIRE 2024 Challenge on Efficient Super-Resolution
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
🦜🔗 Build context-aware reasoning applications
Transparent Image Layer Diffusion using Latent Transparency
This is the official implementation of "VmambaIR: Visual State Space Model for Image Restoration"
[ICLR 2023 Oral] Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model