Stars
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
This is a RAG implementation using an open-source stack. BioMistral 7B was used to build this app, along with PubMedBert as the embedding model, Qdrant as a self-hosted vector DB, and Langchain & L…
Invoicing, Time tracking, File reconciliation, Storage, Financial Overview & your own Assistant made for Freelancers
Production-ready, Light, Flexible and Extensible ASGI API framework | Effortlessly Build Performant APIs
A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube videos.
3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK)
Python-centered read-along of Forecasting: Principles and Practice
The AI-native open-source embedding database
A curated list of GAN & Deepfake papers and repositories.
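A desktop floating-window program that displays the current network speed and CPU and memory usage; it can also display in the taskbar and supports changeable skins.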
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
A website dedicated to showcasing the profiles of prominent Pakistani researchers in the field of AI.
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
🔊 Text-Prompted Generative Audio Model
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
High-Resolution Image Synthesis with Latent Diffusion Models
A simple extension for Jupyter Notebook and Jupyter Lab to beautify Python code automatically using black.