- Huazhong University of Science and Technology
- Wuhan
- UTC +08:00
- https://zc2023.github.io/
- https://orcid.org/0000-0001-6831-5103
- https://scholar.google.com/citations?user=YVDMI8EAAAAJ&hl=en
Highlights
- Pro
Stars
A Gradio web UI for Large Language Models with support for multiple inference backends.
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep lear…
A PyTorch implementation of the Transformer model in "Attention is All You Need".
OpenMMLab's next-generation platform for general 3D object detection.
OpenPCDet Toolbox for LiDAR-based 3D Object Detection.
Minkowski Engine is an auto-diff neural network library for high-dimensional sparse tensors
VMamba: Visual State Space Models; the code is based on Mamba.
MambaOut: Do We Really Need Mamba for Vision?
Code for a series of work in LiDAR perception, including SST (CVPR 22), FSD (NeurIPS 22), FSD++ (TPAMI 23), FSDv2, and CTRL (ICCV 23, oral).
[ICCV 2023] StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection
Mask3D predicts accurate 3D semantic instances achieving state-of-the-art on ScanNet, ScanNet200, S3DIS and STPLS3D.
This is an unofficial implementation of the Point Transformer paper.
[ICML 2024] Official code repository for 3D embodied generalist agent LEO
[CVPR2023] Official Implementation of "DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets"
[NeurIPS 2024] PointMamba: A Simple State Space Model for Point Cloud Analysis
[CVPR2024] OneFormer3D: One Transformer for Unified Point Cloud Segmentation
Official implementation of ECCV24 paper "SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding"
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
[NeurIPS 2024] A Unified Framework for 3D Scene Understanding
[NeurIPS 2024] Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection
Code and data for Grounded 3D-LLM with Referent Tokens
Code for the paper "Masked Autoencoders for Self-Supervised Learning on Automotive Point Clouds"
Official implementation of the paper "Unifying 3D Vision-Language Understanding via Promptable Queries"