Stars
Three.js-based implementation of 3D Gaussian splatting
Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Bring your code to the conversations you care about with GitHub's integration for Slack
Implementation of Fréchet Distance with DINOv2 backbone in PyTorch.
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
NativeLink is an open source high-performance build cache and remote execution server, compatible with Bazel, Buck2, Reclient, and other RBE-compatible build systems. It offers drastically faster b…
[NeurIPS 2024 Spotlight] Implementation of the paper "3D Gaussian Splatting as Markov Chain Monte Carlo"
InstantSplat: Sparse-view SfM-free Gaussian Splatting in Seconds
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗
[ECCV'2024] Gaussian Grouping for open-world Anything reconstruction, segmentation and editing.
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"
A modern model graph visualizer and debugger
🛰️ An approximate nearest-neighbor search library for Python and Java with a focus on ease of use, simplicity, and deployability.
Code for the paper: "No Zero-Shot Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance" [NeurIPS'24]
ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
Official Open Source code for "Scaling Language-Image Pre-training via Masking"
[CVPR2024] NeuRAD: Neural Rendering for Autonomous Driving
Periodic Vibration Gaussian: Dynamic Urban Scene Reconstruction and Real-time Rendering
PyTorch code and models for V-JEPA self-supervised learning from video.
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍