Starred repositories
[ICRA2025]: MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues
PyTorch code and models for the DINOv2 self-supervised learning method.
Generating Robotic Simulation Tasks via Large Language Models
Code and data for Vitruvion: A Generative Model of Parametric CAD Sketches (ICLR 2022)
moojink / openvla-oft
Forked from openvla/openvla
Fine-Tuning Vision-Language-Action Models: Optimizing Speed and Success
Augment robotics demonstration datasets with different robots and viewpoints
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The official repo for "SpatialBot: Precise Spatial Understanding with Vision Language Models."
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
[ICML 2023] Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
DROID Policy Learning and Evaluation
Official Implementation of the CrossMAE paper: Rethinking Patch Dependence for Masked Autoencoders
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
This note presents in a technical though hopefully pedagogical way the three most common forms of neural network architectures: Feedforward, Convolutional and Recurrent.
Tools to Design or Visualize Architecture of Neural Network
Code examples in PyTorch and TensorFlow for CS230
We want to create a repo to illustrate the usage of transformers in Chinese
The official repo for the paper "In-Context Imitation Learning via Next-Token Prediction"
Official implementation of Think Global, Act Local: Dual-scale GraphTransformer for Vision-and-Language Navigation (CVPR'22 Oral).
Popular J2ME application for GPS navigation in mobile phone