TripoSR

This is the official codebase for TripoSR, a state-of-the-art open-source model for fast feedforward 3D reconstruction from a single image, collaboratively developed by Tripo AI and Stability AI.

Building on the principles of the Large Reconstruction Model (LRM), TripoSR introduces key advancements that significantly improve both the speed and quality of 3D reconstruction. The model generates a high-quality 3D model from a single input image in less than 0.5 seconds on an NVIDIA A100 GPU. TripoSR has shown superior performance in both qualitative and quantitative evaluations, outperforming other open-source alternatives across multiple public datasets. The figures below illustrate visual comparisons and metrics showcasing TripoSR's performance relative to other leading models. Details about the model architecture, training process, and comparisons can be found in the technical report (arXiv:2403.02151).

The model is released under the MIT license, which includes the source code, pretrained models, and an interactive online demo. Our goal is to empower researchers, developers, and creatives to push the boundaries of what's possible in 3D generative AI and 3D content creation.

Getting Started

Installation

  • Python >= 3.8
  • Install CUDA if available
  • Install PyTorch according to your platform: https://pytorch.org/get-started/locally/ [Please make sure that the locally installed CUDA major version matches the CUDA major version that PyTorch was built with. For example, if you have CUDA 11.x installed, install a PyTorch build compiled with CUDA 11.x.]
  • Update setuptools by pip install --upgrade setuptools
  • Install other dependencies by pip install -r requirements.txt
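
The CUDA version requirement above can be checked before installing the remaining dependencies. The following is a minimal sketch, not part of the TripoSR codebase: the helper names (`cuda_major`, `local_cuda_version`, `majors_match`) are hypothetical, and it assumes the local toolkit version can be read from the output of `nvcc --version`. The PyTorch side is read from `torch.version.cuda`, which is `None` for CPU-only builds.

```python
import re
import shutil
import subprocess


def cuda_major(version):
    """Return the major version from a string such as '11.8', or None."""
    match = re.match(r"\s*(\d+)", version or "")
    return int(match.group(1)) if match else None


def local_cuda_version():
    """Read the toolkit version from `nvcc --version`; None if nvcc is not on PATH."""
    if shutil.which("nvcc") is None:
        return None
    out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
    match = re.search(r"release (\d+\.\d+)", out)
    return match.group(1) if match else None


def majors_match(local, shipped):
    """True only when both versions are known and share the same major version."""
    a, b = cuda_major(local), cuda_major(shipped)
    return a is not None and a == b


def report():
    """Print the local and PyTorch CUDA versions and whether their majors agree."""
    try:
        import torch
        shipped = torch.version.cuda  # None for CPU-only PyTorch builds
    except ImportError:
        shipped = None
    local = local_cuda_version()
    print(f"local CUDA: {local}, PyTorch CUDA: {shipped}, "
          f"major versions match: {majors_match(local, shipped)}")
```

Calling `report()` after installing PyTorch prints both versions. A result of `False` means either that one of the versions could not be detected or that the major versions differ.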

Manual Inference

python run.py examples/chair.png --output-dir output/

This will save the reconstructed 3D model to output/. You can also specify more than one image path, separated by spaces. The default options take about 6 GB of VRAM for a single image input.

For detailed usage of this script, use python run.py --help.
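
Because run.py accepts several image paths in one call, a folder of images can be processed with a small wrapper. This is a sketch under stated assumptions rather than a script shipped with the repository: `build_command`, `collect_images`, and `run_batch` are hypothetical helpers, and the only run.py options used are the positional image paths and `--output-dir` shown above. It must be run from the repository root so that `run.py` resolves.

```python
import subprocess
import sys
from pathlib import Path


def build_command(image_paths, output_dir):
    """Build the argument list for run.py: image paths first, then --output-dir."""
    if not image_paths:
        raise ValueError("at least one image path is required")
    return [sys.executable, "run.py", *map(str, image_paths),
            "--output-dir", str(output_dir)]


def collect_images(folder, pattern="*.png"):
    """Return the sorted image paths in a folder that match the glob pattern."""
    return sorted(Path(folder).glob(pattern))


def run_batch(image_paths, output_dir):
    """Invoke run.py once for all images; raises CalledProcessError on failure."""
    subprocess.run(build_command(image_paths, output_dir), check=True)
```

For example, `run_batch(collect_images("examples"), "output/")` passes every PNG in examples/ to a single run.py invocation. Note that the 6 GB VRAM figure above is quoted for a single image input.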

Local Gradio App

Install Gradio:

pip install gradio

Start the Gradio App:

python gradio_app.py

Troubleshooting

AttributeError: module 'torchmcubes_module' has no attribute 'mcubes_cuda'

or

torchmcubes was not compiled with CUDA support, use CPU version instead.

Both errors occur because torchmcubes was compiled without CUDA support. Please make sure that:

  • The locally installed CUDA major version matches the CUDA major version that PyTorch was built with. For example, if you have CUDA 11.x installed, install a PyTorch build compiled with CUDA 11.x.
  • setuptools is version 49.6.0 or later. If it is not, upgrade it with pip install --upgrade setuptools.

Then re-install torchmcubes by:

pip uninstall torchmcubes
pip install git+https://github.com/tatsy/torchmcubes.git
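
Before re-installing, the setuptools requirement can be verified programmatically. The sketch below is illustrative only: `version_tuple`, `meets_minimum`, and `installed_setuptools` are hypothetical helper names, and the comparison uses a simplified numeric parse of the first three version components rather than full PEP 440 handling.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Convert '49.6.0' to (49, 6, 0); stops at the first non-numeric component."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    parts += [0] * (3 - len(parts))  # pad so '49.6' compares equal to '49.6.0'
    return tuple(parts[:3])


def meets_minimum(installed, minimum="49.6.0"):
    """True when the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_setuptools():
    """Return the installed setuptools version string, or None if it is missing."""
    try:
        return metadata.version("setuptools")
    except metadata.PackageNotFoundError:
        return None
```

If `installed_setuptools()` returns `None`, or `meets_minimum` returns `False` for the value it returns, upgrade setuptools before running the two pip commands above.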

Citation

@article{TripoSR2024,
  title={TripoSR: Fast 3D Object Reconstruction from a Single Image},
  author={Tochilkin, Dmitry and Pankratz, David and Liu, Zexiang and Huang, Zixuan and Letts, Adam and Li, Yangguang and Liang, Ding and Laforte, Christian and Jampani, Varun and Cao, Yan-Pei},
  journal={arXiv preprint arXiv:2403.02151},
  year={2024}
}