
Real3D: Scaling Up Large Reconstruction Models with Real-World Images

Hanwen Jiang, Qixing Huang, Georgios Pavlakos



(Overview figure: overview.png)

Abstract: As single-view 3D reconstruction is ill-posed due to the ambiguity from 2D to 3D, the reconstruction models have to learn generic shape and texture priors from large data. The default strategy for training single-view Large Reconstruction Models (LRMs) follows the fully supervised route, using synthetic 3D assets or multi-view captures. Although these resources simplify the training procedure, they are hard to scale up beyond the existing datasets and they are not necessarily representative of the real distribution of object shapes. To address these limitations, in this paper, we introduce Real3D, the first LRM system that can be trained using single-view real-world images. Real3D introduces a novel self-training framework that can benefit from both the existing 3D/multi-view synthetic data and diverse single-view real images. We propose two unsupervised losses that allow us to supervise LRMs at the pixel- and semantic-level, even for training examples without ground-truth 3D or novel views. To further improve performance and scale up the image data, we develop an automatic data curation approach to collect high-quality examples from in-the-wild images. Our experiments show that Real3D consistently outperforms prior work in four diverse evaluation settings that include real and synthetic data, as well as both in-domain and out-of-domain shapes.

Installation

First, set up the conda environment and install the dependencies.

conda create --name real3d python=3.8
conda activate real3d

# Install PyTorch; we use:
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=11.8 -c pytorch -c nvidia

pip install -r requirements.txt 

Then, download the pretrained model weights and place them at ./checkpoints/model_both_trained_v1.ckpt.
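A minimal sketch of the expected layout (the download link is in the repository README; the source path below is a placeholder):

mkdir -p checkpoints
mv /path/to/downloaded/model_both_trained_v1.ckpt ./checkpoints/model_both_trained_v1.ckpt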

Demo

Use ./run.sh, modifying the image path and the foreground segmentation config in it accordingly. Tune the chunk size to fit your GPU memory.
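For reference, an invocation after editing run.sh might look like the sketch below; the entry point and flag names are assumptions carried over from the TripoSR codebase this repo extends, so check run.sh for the actual interface.

# Hypothetical invocation; lower --chunk-size if you run out of GPU memory.
python run.py path/to/your_image.png --chunk-size 8192 --output-dir output/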

Training

Data preparation

This repo uses MVImgNet, CO3D, OmniObject3D, and our collected real images. Please see this file for data preparation instructions.

Step 0: (Optional) Fine-tune TripoSR

As TripoSR predicts 3D shapes at randomized scales, we first need to fine-tune it on Objaverse. We provide the fine-tuned model weights, so you can place them at ./checkpoints/model_both.ckpt and skip this stage.
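If you skip the fine-tuning, placing the provided weights is enough (the source path below is a placeholder for wherever you downloaded them):

mkdir -p checkpoints
mv /path/to/downloaded/model_both.ckpt ./checkpoints/model_both.ckpt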

Step 1: Self-training on real images

Use ./train_sv.sh.
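The script encodes the dataset paths and hyperparameters, so the usual workflow is to edit those values in place and launch it directly:

# Edit the data paths and GPU settings inside train_sv.sh first, then:
bash ./train_sv.sh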

Evaluation

Use ./eval.sh, modifying the script and config accordingly. For example, to evaluate on CO3D with ground-truth multi-views, use eval_mv.py with ./config/eval/eval_mv_co3d.yaml. To evaluate on single-view images, use eval_sv.py with ./config/eval/eval_sv.yaml.
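For reference, the two settings above can be launched as sketched below; the --config flag name is an assumption, so check eval.sh for the exact interface.

# Multi-view evaluation on CO3D (flag name assumed):
python eval_mv.py --config ./config/eval/eval_mv_co3d.yaml

# Single-view evaluation:
python eval_sv.py --config ./config/eval/eval_sv.yaml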

TODO

  • Release real-world data.

Acknowledgement

This repo is built on top of TripoSR.

BibTeX

@article{jiang2024real3d,
   title={Real3D: Scaling Up Large Reconstruction Models with Real-World Images},
   author={Jiang, Hanwen and Huang, Qixing and Pavlakos, Georgios},
   journal={arXiv preprint arXiv:2406.08479},
   year={2024},
}
