Stars
🦜🔗 Build context-aware reasoning applications
A latent text-to-image diffusion model
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
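pytorch handbook is an open-source book that aims to help people who want to use PyTorch for deep learning development and research get started quickly. All of the PyTorch tutorials it contains have been tested to make sure they run successfully.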
Instruct-tune LLaMA on consumer hardware
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.
High-Resolution Image Synthesis with Latent Diffusion Models
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
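AISystem mainly refers to AI systems, covering full-stack, low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks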
QLoRA: Efficient Finetuning of Quantized LLMs
Code release for NeRF (Neural Radiance Fields)
LAVIS - A One-stop Library for Language-Vision Intelligence
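"A Usage Guide to Open-Source Large Models": quickly deploy open-source large models in a Linux environment; a deployment tutorial tailored to Chinese users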
COCO API - Dataset @ http://cocodataset.org/
Taming Transformers for High-Resolution Image Synthesis
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
[ICCV 2019] Monocular depth estimation from a single image
Tutorials for creating and using ONNX models
We unified the interfaces of instruction-tuning data (e.g., CoT data), multiple LLMs and parameter-efficient methods (e.g., LoRA, P-tuning) together for easy use. We welcome open-source enthusiasts…
A Unified Library for Parameter-Efficient and Modular Transfer Learning
Scenarios, tutorials and demos for Autonomous Driving
Metric depth estimation from a single image
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
An unsupervised learning framework for depth and ego-motion estimation from monocular videos
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conv…
OneDiff: An out-of-the-box acceleration library for diffusion models.