PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation

Lukas Meyer*, Floris Erich, Yusuke Yoshiyasu, Marc Stamminger, Noriaki Ando, Yukiyasu Domae
*This work was conducted during an internship at the National Institute of Advanced Industrial Science and Technology.

We introduce the Physically Enhanced Gaussian Splatting Simulation System (PEGASUS) for 6DoF object pose dataset generation, a versatile dataset generator based on 3D Gaussian Splatting. Preparation starts by scanning environments and objects separately. PEGASUS allows the composition of new scenes by merging the underlying Gaussian Splatting point cloud of an environment with those of one or multiple objects. A physics engine simulates natural object placement within a scene by interacting with the meshes extracted from the objects and the environment. Consequently, a large number of new scenes, static or dynamic, can be created by combining different environments and objects. By rendering scenes from various perspectives, diverse data points such as RGB images, depth maps, semantic masks, and 6DoF object poses can be extracted. Our study demonstrates that training on data generated by PEGASUS enables pose estimation networks to successfully transfer from synthetic data to real-world data. Moreover, we introduce the CupNoodle dataset, comprising 30 Japanese cup noodle items. This dataset includes spherical scans that capture images from both object hemispheres as well as the Gaussian Splatting reconstructions, making the items compatible with PEGASUS.

Funding and Acknowledgments

This paper is one of the achievements of joint research with, and is copyrighted material owned by, the ROBOT Industrial Basic Technology Collaborative Innovation Partnership. This research has been supported by the New Energy and Industrial Technology Development Organization (NEDO), under the project ID JPNP20016.

Cloning the Repository

The repository contains submodules, so please clone it with:

git clone https://github.com/meyerls/PEGASUS.git --recursive # HTTPS
git submodule update --init --recursive

Requirements

The code has been tested with the following dependencies:

Python 3.8
CUDA 11.6
PyTorch 1.12.1
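
To compare a local environment against the tested versions listed above, a small check like the following may help. It is only a sketch and not part of the repository: the function names are hypothetical, and PyTorch is imported lazily so the script also runs before the setup is complete.

```python
import sys

# Versions listed in this README as tested.
TESTED_PYTHON = "3.8"
TESTED_TORCH = "1.12.1"
TESTED_CUDA = "11.6"


def matches_tested(found, tested):
    """Return True if `found` equals the tested version.

    Build suffixes such as "+cu116" are ignored; None never matches.
    """
    if found is None:
        return False
    return found.split("+")[0] == tested


def report_environment():
    """Collect the installed versions; missing PyTorch is reported as None."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}
    try:
        import torch  # optional: only available after the setup has run

        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        report["torch"] = None
        report["cuda"] = None
    return report


if __name__ == "__main__":
    env = report_environment()
    tested = {"python": TESTED_PYTHON, "torch": TESTED_TORCH, "cuda": TESTED_CUDA}
    for name, version in env.items():
        status = "matches" if matches_tested(version, tested[name]) else "differs from"
        print(f"{name}: {version} ({status} tested version {tested[name]})")
```

A mismatch does not necessarily mean the code will fail; it only means the combination has not been tested.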

Setup

Our default install method is based on Conda and is provided by the following script, which has to be executed from the top level of the repository. Currently, the setup script has only been tested on Ubuntu 20. An installation on Windows should be possible but is not provided in this repository.

./setup.sh

Overview

PEGASUS consists of three main components:

GS Base Environment reconstruction
GS Object Reconstruction
PEGASUS Dataset Extraction

GS Base Environment reconstruction

Will be updated soon

GS Object Reconstruction

Will be updated soon! Not yet complete

For object reconstruction we provide two different processing workflows. The first scans objects in the wild by taking videos from both sides of the object; the second uses a camera rig to scan the object on a turntable. The in-the-wild approach uses XMEM to create a segmentation mask of the selected object. For scanning, only an ArUco marker has to be placed into the scene to obtain the correct scale. The turntable approach uses an arbitrary calibration object (I have used a texture-rich sheet of paper with an ArUco marker) so that its precomputed camera poses can be reused. A detailed workflow is provided in the following section.

In the Wild scanning

The workflow for scanning objects in the wild is:

1. Select Object

Currently this workflow does not work for texture-poor objects, because computing the camera poses and registering the images from the bottom view fails with COLMAP. For such objects the camera rig is more suitable. Place the object on a planar surface such as a table and make sure that you can move all the way around the object.

2. ArUco Marker

Print out an ArUco marker and place it next to the object. To scale the object, measure the side length of the square ArUco marker and note it down. A website to create ArUco markers can be found here.

3. Scanning

Record two videos with your phone camera or a DSLR camera (we used an iPhone 12 in our example). The first video contains a hemispherical scan of the top view of the object. Try to cover a 360 degree view at 2-3 different height levels. For the second video, repeat this process with the object flipped upside down.

4. Segmentation Mask

To extract the semantic masks from the video we used XMEM.

XMEM can be started from the root directory of PEGASUS:

python submodules/XMem/interactive_demo.py --video [path to the video] --num_objects 1 --size -1

In the XMEM GUI select the object you want to extract (the object should be highlighted in red). Afterward click the button Forward Propagate (1) to extract the masks. Depending on the video length it takes around 1-2 min. To save the detected masks click on Export Overlays as Video (2) to save the binary masks as images. More info on how to use XMEM can be found here.

Note: Choose the image size according to your GPU memory and the quality you need. --size -1 keeps the original image size. Any other value resizes the image so that its shorter side matches that value.
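
The effect of the size option on the frame resolution can be estimated with a few lines of arithmetic. This is a sketch of the shorter-side rule only; the function name is hypothetical and the rounding to whole pixels is an assumption, so XMEM itself may produce dimensions that differ by a pixel.

```python
def resized_shape(height, width, size):
    """Return (height, width) after scaling the shorter side to `size`.

    A negative `size` (for example -1) keeps the original resolution.
    Rounding to the nearest pixel is an assumption of this sketch.
    """
    if size < 0:
        return height, width
    scale = size / min(height, width)
    return round(height * scale), round(width * scale)


if __name__ == "__main__":
    # A 1920x1080 video with the shorter side scaled to 480 pixels.
    print(resized_shape(1080, 1920, 480))
    # -1 leaves the frames untouched.
    print(resized_shape(1080, 1920, -1))
```

Smaller values reduce GPU memory use and speed up mask propagation at the cost of coarser masks.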

6. Dataset Integration

First, both the extracted images and the masks have to be put into a common folder. This folder should be placed in a dataset folder in which multiple reconstructed objects can be stored.

.
└── bouillon
    ├── down
    │   ├── images
    │   └── masks
    └── up
        ├── images
        └── masks
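
Before running the reconstruction it can be useful to verify that an object folder follows the layout above. The helper below is a sketch and not part of PEGASUS: the function name is hypothetical, and the assumption that a hemispherical recording only needs the `up` folder is derived from the RECORDING_TYPE description below.

```python
from pathlib import Path

# Views expected per recording type. That 'hemispherical' maps to 'up' only
# is an assumption of this sketch.
EXPECTED_VIEWS = {"spherical": ("up", "down"), "hemispherical": ("up",)}


def missing_folders(object_dir, recording_type="spherical"):
    """Return the sub-folders (as relative POSIX paths) that are missing."""
    object_dir = Path(object_dir)
    missing = []
    for view in EXPECTED_VIEWS[recording_type]:
        for sub in ("images", "masks"):
            folder = object_dir / view / sub
            if not folder.is_dir():
                missing.append(folder.relative_to(object_dir).as_posix())
    return missing


if __name__ == "__main__":
    # Hypothetical path; replace it with your own dataset folder.
    problems = missing_folders("dataset/bouillon", "spherical")
    print("Layout OK" if not problems else f"Missing folders: {problems}")
```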

To use the scanned object and include it in PEGASUS, define the object as a Dataset-Object in in_the_wild_dataset.py. The class name (here Bouillon) is taken from the name of the object.

from pathlib import Path  # needed for the Path(...) call below


# InTheWild is the base class defined in in_the_wild_dataset.py
class Bouillon(InTheWild):
    OBJECT_NAME = 'bouillon'
    ID = 201
    TYPE = 'object'
    RECORDING_TYPE = 'spherical'  # 'spherical' or 'hemispherical'
    ALPHA = 0.3
    DATASET_TYPE = 'wild'
    ARUCO_SIZE = 0.037  # in meters

    def __init__(self, dataset_path):
        super().__init__(dataset_path=Path(dataset_path))

OBJECT_NAME: Folder name of the object. By default this is the video name inside the ./workspace folder (this folder is generated by XMEM). Please rename the folder to the object name.
ID: Unique object ID
TYPE: Type of the scan. Use 'object' for objects; environments use a different type (default: object)
RECORDING_TYPE: 'spherical' or 'hemispherical', depending on whether you also scanned the bottom. This is recommended if you have texture-less objects. 'spherical': 2 videos (top & bottom). 'hemispherical': 1 video (top only)
ALPHA: Value for the alpha shape reconstruction (default: 0.3)
DATASET_TYPE: Name for your own dataset (default: wild)
ARUCO_SIZE: Side length of the ArUco marker in meters(!)
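
ARUCO_SIZE is what makes the reconstruction metric: a reconstruction from images alone has an arbitrary scale, and the known marker size fixes it. The sketch below illustrates that idea under stated assumptions; it is not the PEGASUS implementation. The function names are hypothetical, and the four marker corners are assumed to be already available as 3D points in reconstruction units, given in order around the square.

```python
import math


def marker_side_length(corners):
    """Mean side length of a square marker from its 4 ordered 3D corners."""
    sides = [math.dist(corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return sum(sides) / 4.0


def metric_scale(corners, aruco_size):
    """Factor that converts reconstruction units into meters."""
    return aruco_size / marker_side_length(corners)


def apply_scale(points, scale):
    """Scale every 3D point uniformly about the origin."""
    return [tuple(scale * c for c in p) for p in points]


if __name__ == "__main__":
    # Hypothetical marker corners in arbitrary reconstruction units.
    corners = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
    scale = metric_scale(corners, aruco_size=0.037)  # marker measured as 3.7 cm
    print("scale factor:", scale)
    print("scaled corners:", apply_scale(corners, scale))
```

Because the scale factor is proportional to ARUCO_SIZE, entering the marker size in centimeters instead of meters would make the object 100 times too large, which is why the unit matters.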

7. GS Reconstruction

python src/reconstruction/in_the_wild_object_reconstruction.py

8. Integrate into PEGASUS

Tbd

Available Objects (Ramen Dataset and PEGASET)

We provide two different datasets. The IDs for the Ramen dataset are between 101 and 130. The YCB-V IDs are identical to the original YCB-V IDs.
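
When mixing objects from both datasets with your own scans, a small lookup can help keep the ID ranges apart. The following is a sketch with a hypothetical function name. It assumes that the original YCB-V IDs run from 1 to 21 (one per YCB-V object), and it treats every other ID, such as 201 from the Bouillon example above, as a custom object.

```python
def dataset_for_id(object_id):
    """Map an object ID to the dataset it presumably belongs to.

    Ramen IDs (101 to 130) come from this README; the YCB-V range
    1 to 21 is an assumption of this sketch.
    """
    if 101 <= object_id <= 130:
        return "ramen"
    if 1 <= object_id <= 21:
        return "ycbv"
    return "custom"


if __name__ == "__main__":
    for object_id in (5, 117, 201):
        print(object_id, "->", dataset_for_id(object_id))
```

Choosing IDs for your own objects outside both ranges avoids collisions when the datasets are merged.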

Ramen Dataset

The Ramen Dataset consists of 30 cup noodle objects and 9 environments.

.
└── Dataset 
    ├── calibration
    │   ├── ...
    ├── environment
    │   ├── ...
    ├── object
    │   ├── ...
    └── urdf
        └── ...

PEGASET

PEGASET consists of the 21 well-known YCB-V objects and 9 environments.

.
└── Dataset 
    ├── calibration
    │   ├── ...
    ├── environment
    │   ├── ...
    ├── object
    │   ├── ...
    └── urdf
        └── ...

PEGASUS Dataset Extraction

Before rendering a dataset, the data provided for PEGASUS must be downloaded from the Ramen Dataset or PEGASET. If you use both datasets, you should merge them into one folder structure.
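
Since both datasets share the same top-level layout (calibration, environment, object, urdf), merging them can be scripted. The following is a minimal sketch, not a tool shipped with PEGASUS: the function name and the example paths are hypothetical. It relies on `shutil.copytree(..., dirs_exist_ok=True)`, which requires Python 3.8 or newer.

```python
import shutil
from pathlib import Path

# Top-level folders shared by the Ramen Dataset and PEGASET.
SUBFOLDERS = ("calibration", "environment", "object", "urdf")


def merge_datasets(sources, target):
    """Copy the shared sub-folders of every source dataset into `target`.

    Files with the same relative path are overwritten by later sources.
    """
    target = Path(target)
    for source in sources:
        for name in SUBFOLDERS:
            src = Path(source) / name
            if src.is_dir():
                shutil.copytree(src, target / name, dirs_exist_ok=True)
    return target


if __name__ == "__main__":
    # Hypothetical download locations; adjust them to your setup.
    merge_datasets(["downloads/ramen/Dataset", "downloads/pegaset/Dataset"], "Dataset")
```

The sources are copied rather than moved, so the downloads stay intact if the merge has to be repeated.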

All objects and environments which are relevant for dataset generation should be added to the obj_list and env_list.

Parameters:

mode: str: Rendering mode of the scene, either "dynamic" or "static"
num_scenes: int: Number of scenes to generate
num_objects: int: Maximum number of objects placed in a scene. A random number between 1 and num_objects is chosen.
image_height: int: Height of the rendered images in pixels
image_width: int: Width of the rendered images in pixels
render_data_points: list: Types of renderings and data points saved to the output, e.g. ['rgb', 'depth', 'seg_vis', 'seg_sil', 'sem_seg']
convert_from_scenewise2imagewise: bool: By default the data is saved scene-wise. Set to True to convert the output from the scene-wise to the image-wise BOP format.
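
The parameters above can be grouped and validated before a long rendering run is started. The dataclass below is only a sketch: the class name, the helper and all default values are hypothetical and not taken from pegasus.py. It also shows how the number of objects per scene is drawn between 1 and num_objects.

```python
import random
from dataclasses import dataclass, field
from typing import List

VALID_MODES = ("static", "dynamic")


@dataclass
class RenderConfig:
    """Hypothetical container for the parameters listed above."""

    mode: str = "static"
    num_scenes: int = 10
    num_objects: int = 5
    image_height: int = 480   # assumed default, in pixels
    image_width: int = 640    # assumed default, in pixels
    render_data_points: List[str] = field(default_factory=lambda: ["rgb", "depth"])
    convert_from_scenewise2imagewise: bool = False

    def __post_init__(self):
        # Fail early on values that cannot produce a valid dataset.
        if self.mode not in VALID_MODES:
            raise ValueError(f"mode must be one of {VALID_MODES}, got {self.mode!r}")
        if self.num_scenes < 1 or self.num_objects < 1:
            raise ValueError("num_scenes and num_objects must be at least 1")
        if self.image_height < 1 or self.image_width < 1:
            raise ValueError("image size must be positive")
        if not self.render_data_points:
            raise ValueError("render_data_points must not be empty")


def sample_object_count(config, rng=None):
    """Draw the number of objects for one scene, between 1 and num_objects."""
    rng = rng or random
    return rng.randint(1, config.num_objects)


if __name__ == "__main__":
    config = RenderConfig(mode="dynamic", num_scenes=3, num_objects=4)
    for scene in range(config.num_scenes):
        print("scene", scene, "objects:", sample_object_count(config))
```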

BibTex

@Article{PEGASUS2024,
      author       = {Meyer, Lukas and Erich, Floris and Yoshiyasu, Yusuke and Stamminger, Marc and Ando, Noriaki and Domae, Yukiyasu},
      title        = {PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation},
      journal      = {IROS},
      month        = {October},
      year         = {2024},
      url          = {https://meyerls.github.io/pegasus_web}
}

Thanks to the authors of 3D Gaussian Splatting for their excellent code. Please consider citing their work as well:

@Article{kerbl3Dgaussians,
      author       = {Kerbl, Bernhard and Kopanas, Georgios and Leimk{\"u}hler, Thomas and Drettakis, George},
      title        = {3D Gaussian Splatting for Real-Time Radiance Field Rendering},
      journal      = {ACM Transactions on Graphics},
      number       = {4},
      volume       = {42},
      month        = {July},
      year         = {2023},
      url          = {https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/}
}

Thanks also to the authors of the BOP Toolkit for their interface to the benchmark for 6D object pose estimation.
