The AI Institute - Boston - https://www3.cs.stonybrook.edu/~jishang/
Starred repositories
Motion-Controllable Video Diffusion via Warped Noise
Official repo of ICLR'25 - LLaRA: Large Language and Robotics Assistant
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
ASCII generator (image to text, image to image, video to video)
Adaptive Caching for Faster Video Generation with Diffusion Transformers
[ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
[ECCV 2022] [T-PAMI] StARformer: Transformer with State-Action-Reward Representations.
Code for NeurIPS 2023 paper "Active Vision Reinforcement Learning with Limited Visual Observability"
Environments for Active Vision Reinforcement Learning
PyTorch code and pretrained weights for the UNIC models.
Parallel t-SNE implementation with Python and Torch wrappers.
Assessing Sample Quality via the Latent Space of Generative Models (ECCV 2024)
Official repository for "AM-RADIO: Reduce All Domains Into One"
Inceptive Visual Representation Learning with Diverse Attention Across Heads. Image Classification, Action Recognition, and Robot Learning.
Language Repository for Long Video Understanding
Official repo of ICRA'24 - Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
An unofficial PyTorch dataloader for the Open X-Embodiment datasets: https://github.com/google-deepmind/open_x_embodiment
CoTracker is a model for tracking any point (pixel) in a video.
A simple and highly efficient RTS-game-inspired environment for reinforcement learning (formerly Gym-MicroRTS)
Official Code for PathLDM: Text conditioned Latent Diffusion Model for Histopathology (WACV 2024)
This repository contains the implementation for our work "Topology-Aware Uncertainty for Image Segmentation", accepted to NeurIPS 2023.