Stars
A guidance language for controlling large language models.
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
Generalist YOLO: Towards Real-Time End-to-End Multi-Task Visual Language Models
Official Implementation of DINO-Foresight: Looking into the Future with DINO
ripgrep recursively searches directories for a regex pattern while respecting your gitignore
Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …
Open Source Data Annotation & Labeling Tools
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
🐍 Geometric Computer Vision Library for Spatial AI
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
The code for the paper "Efficient Self-Supervised Video Hashing with Selective State Spaces" (AAAI'25).
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
Clips AI is an open-source Python library that automatically converts long videos into clips.
AI-Video-Cropper is a Python-based tool that leverages the power of GPT-4 (OpenAI's language model) to automatically analyze videos, extract the most interesting sections, and crop them for improve…
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
💎 Detect, track, and extract the optimal face among multiple target faces (excluding side faces and selecting the optimal face).
A more real-time adaptation of Deep SORT
An MIT-licensed implementation of YOLOv9, YOLOv7, and YOLO-RD
Graph learning framework for long-term video understanding
OpenOCR: A general OCR system built for accuracy and efficiency. It supports 24 Scene Text Recognition methods trained from scratch on large-scale real datasets, and the latest methods will continue to be added.
Simple Implementation of Pix2Seq model for object detection in PyTorch
Find, verify, and analyze leaked credentials
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
📝 Awesome and classical image retrieval papers