Stars
This repository contains code for object detection and tracking in videos using the YOLOv10 object detection model and the DeepSORT algorithm.
Ultralytics YOLOv8, YOLOv9, YOLOv10, YOLOv11 for ROS 2
TensorRT for YOLO series (YOLOv11, YOLOv10, YOLOv9, YOLOv8, YOLOv7, YOLOv6, YOLOX, YOLOv5), NMS plugin support
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
🚀🚀🚀 YOLO series of PaddlePaddle implementation, PP-YOLOE+, RT-DETR, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv10, YOLOX, YOLOv5u, YOLOv7u, YOLOv6Lite, RTMDet and so on. 🚀🚀🚀
Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
Implementation of popular deep learning networks with TensorRT network definition API
The dataset for drone-based detection and tracking is released, including images, videos, and annotations.
FSL-Mate: A collection of resources for few-shot learning (FSL).
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
HuatuoGPT2, One-stage Training for Medical Adaption of LLMs. (An Open Medical GPT)
Basic UNet, AUNet, and ResNet architecture models and new variations: Connected-UNets, Connected-AUNets, and Connected-ResUNets architecture models
MambaOut: Do We Really Need Mamba for Vision?
🎨 ML Visuals contains figures and templates which you can reuse and customize to improve your scientific writing.
RepVGG: Making VGG-style ConvNets Great Again
Repo for BenTsao [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
Implementation of different kinds of UNet models for image segmentation: UNet, RCNN-UNet, Attention UNet, RCNN-Attention UNet, Nested UNet
This is the official repository for the paper "3D TransUNet: Advancing Medical Image Segmentation through Vision Transformers"
This repository includes the official project of TransUNet, presented in our paper: TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
VMamba: Visual State Space Models; code is based on Mamba
DeepDenoiser: Seismic Signal Denoising and Decomposition Using Deep Neural Networks
The Python micro framework for building web applications.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.