Stars
Paddle.js is a web project for Baidu PaddlePaddle, which is an open source deep learning framework running in the browser. Paddle.js can either load a pre-trained model or transform a model fro…
⚡FlashRAG: A Python Toolkit for Efficient RAG Research
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
DeepStream SDK Python bindings and sample applications
OpenMMLab Detection Toolbox and Benchmark
Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
This repository contains demos I made with the Transformers library by Hugging Face.