
Add support for importing semantics from https://3dscenegraph.stanford.edu for use with gibson dataset #374

Open
msbaines opened this issue Dec 10, 2019 · 9 comments

@msbaines

🚀 Feature

We want to be able to use the semantic dataset from https://3dscenegraph.stanford.edu/ with the scene dataset from http://gibsonenv.stanford.edu/

@msbaines

The 3dscenegraph semantic dataset is currently limited to gibson_tiny. However, semantics for gibson_medium are expected to be released soon.

@msbaines

msbaines commented Dec 10, 2019

The mesh used for semantics is different from the mesh used for Habitat. The coordinate system also differs between the two meshes: the Y and Z axes are swapped, but the origin is the same. We should be able to generate a semantic .ply mesh from the original .obj mesh by transforming the vertex coordinates.

@msbaines

The semantic data from 3DSceneGraph is distributed as a .npz file.

To access the data (with numpy imported as np):

data = np.load(npz_path, allow_pickle=True)['output'].item()

This returns a dictionary with the following keys:

dict_keys(['building', 'room', 'object', 'camera', 'panorama'])

building

>>> pprint.pprint(data['building'])
{'floor_area': 35.04333970662052,
 'function': 'residential',
 'gibson_split': 'tiny',
 'id': 2,
 'name': 'Allensville',
 'num_cameras': 26,
 'num_floors': 1,
 'num_objects': 33,
 'num_rooms': 11,
 'object_inst_segmentation': array([[0.],
       [0.],
       [0.],
       ...,
       [0.],
       [0.],
       [0.]]),
 'object_voxel_occupancy': array([[0.],
       [0.],
       [0.],
       ...,
       [0.],
       [0.],
       [0.]]),
 'reference_point': array([0., 0., 0.]),
 'room_inst_segmentation': array([[10.],
       [ 8.],
       [ 8.],
       ...,
       [11.],
       [11.],
       [11.]]),
 'room_voxel_occupancy': array([[0.],
       [0.],
       [0.],
       ...,
       [0.],
       [0.],
       [0.]]),
 'size': array([9.76260114, 8.85856447, 2.50682107]),
 'volume': 201.63530725199752,
 'voxel_centers': array([[-0.98990329, -0.97658977, -0.02014603],
       [-0.98990329, -0.97658977,  0.07985397],
       [-0.98990329, -0.97658977,  0.17985397],
       ...,
       [ 8.71009671,  7.82341023,  2.27985397],
       [ 8.71009671,  7.82341023,  2.37985397],
       [ 8.71009671,  7.82341023,  2.47985397]]),
 'voxel_resolution': array([98, 89, 26]),
 'voxel_size': 0.1}

room

>>> pprint.pprint(data['room'])
{1: {'floor_area': 8.73437848991798,
     'floor_number': 'A',
     'id': 1,
     'location': array([3.53988  , 0.2945975, 1.116783 ]),
     'parent_building': 2,
     'scene_category': 'bathroom',
     'size': array([2.5752  , 2.370445, 2.254074]),
     'volume': 10.48092837735436},
 2: {'floor_area': 9.826049281845457,
     'floor_number': 'A',
     'id': 2,
     'location': array([0.42217  , 2.404948 , 1.1276055]),
     'parent_building': 2,
     'scene_category': 'bathroom',
     'size': array([2.88698 , 2.927084, 2.259769]),
     'volume': 13.84569568093703},
 3: {'floor_area': 11.789331640706246,
     'floor_number': 'A',
     'id': 3,
     'location': array([6.99981 , 0.605225, 1.24227 ]),
     'parent_building': 2,
     'scene_category': 'bedroom',
     'size': array([3.39914, 3.06071, 2.48226]),
     'volume': 23.095719066402776},

 ...

 11: {'floor_area': 7.659161263755288,
      'floor_number': 'A',
      'id': 11,
      'location': array([ 0.2298935, -0.0203395,  1.2285965]),
      'parent_building': 2,
      'scene_category': 'lobby',
      'size': array([2.361493, 1.971701, 2.453287]),
      'volume': 9.14560537370886}}

object

>>> pprint.pprint(data['object'])
{1: {'action_affordance': ['open', 'close', 'cook', 'heat', 'defrost', 'clean'],
     'class_': 'microwave',
     'floor_area': 2.826599475275465,
     'id': 1,
     'location': array([2.83998585, 4.76085063, 1.49223023]),
     'material': ['glass', 'metal'],
     'parent_room': 9,
     'size': array([0.40677453, 1.2802279 , 0.45474387]),
     'surface_coverage': 0.6978848300032634,
     'tactile_texture': None,
     'visual_texture': None,
     'volume': 0.08689193617144757},
 2: {'action_affordance': ['open',
                           'close',
                           'heat',
                           'turn on',
                           'turn off',
                           'clean'],
     'class_': 'oven',
     'floor_area': 3.1440354889034574,
     'id': 2,
     'location': array([2.98861606, 4.78304369, 0.46367262]),
     'material': ['metal', 'glass'],
     'parent_room': 9,
     'size': array([0.7124521 , 1.00192841, 0.94029514]),
     'surface_coverage': 1.3838881855100549,
     'tactile_texture': None,
     'visual_texture': None,
     'volume': 0.32710032579657},
 3: {'action_affordance': ['wash', 'clean'],
     'class_': 'sink',
     'floor_area': 1.7597120848145011,
     'id': 3,
     'location': array([ 4.23522156, -0.57456161,  0.91402512]),
     'material': ['ceramic', None],
     'parent_room': 1,
     'size': array([0.57416017, 0.54074392, 0.17042408]),
     'surface_coverage': 0.2042751198409106,
     'tactile_texture': None,
     'visual_texture': None,
     'volume': 0.014058957460380607},

 ...

 33: {'action_affordance': ['sit at',
                            'lay on',
                            'pick up',
                            'move',
                            'clean',
                            'set',
                            'decorate'],
      'class_': 'dining table',
      'floor_area': 3.473003668596787,
      'id': 33,
      'location': array([4.48357247, 6.70686119, 0.5614044 ]),
      'material': ['wood', None],
      'parent_room': 8,
      'size': array([1.14685995, 0.68447991, 0.6124021 ]),
      'surface_coverage': 1.3836484369355602,
      'tactile_texture': None,
      'visual_texture': None,
      'volume': 0.27629888509653355}}

camera

>>> pprint.pprint(data['camera'])
{1: {'FOV': 1.0489180166567196,
        'id': 1,
        'location': array([6.19820356, 4.94441748, 1.27608538]),
        'modality': 'RGB',
        'name': 'point_0_view_0',
        'parent_room': 11,
        'resolution': array([1024, 1024]),
        'rotation': [1.616633415222168, -0.01483128871768713, 1.8443574905395508]},

 ...

 2863: {'FOV': 1.0376312872786024,
        'id': 2863,
        'location': array([3.41174531, 6.73860884, 1.23510659]),
        'modality': 'RGB',
        'name': 'point_9_view_4',
        'parent_room': 11,
        'resolution': array([1024, 1024]),
        'rotation': [1.9443336725234985,
                     0.0011426351265981793,
                     -1.8368979692459106]}}

panorama

>>> pprint.pprint(data['panorama'])
{'p000001': {'object_class': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16),
             'object_instance': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16)},
 'p000002': {'object_class': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16),
             'object_instance': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16)},

 ...

 'p000026': {'object_class': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16),
             'object_instance': array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int16)}}
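
As an illustration of how these tables link together, the object and room dictionaries can be cross-referenced directly (ids taken from the Allensville dump above):

>>> obj = data['object'][3]                  # the 'sink' instance
>>> room = data['room'][obj['parent_room']]  # parent_room is 1
>>> obj['class_'], room['scene_category']
('sink', 'bathroom')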

@msbaines

I was hoping to load the .npz file in C++ using cnpy, but the .npz contains pickled data, which cnpy can't handle. I could potentially handle the pickled data using http://www.picklingtools.com/.

Alternatively, I could do all the processing in Python, but the Python tools for writing a mesh are cumbersome and likely slow. So for now, I think I will write a Python script to convert the data I need into a format that can be easily loaded in C++ and then do the processing in C++.
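
A rough sketch of what that conversion script could look like (the file names and the JSON layout here are placeholders, not a final format):

import json
import numpy as np

data = np.load(npz_path, allow_pickle=True)['output'].item()

# Per-face object ids as a flat int16 array, trivially readable from C++.
face_ids = data['building']['object_inst_segmentation'].astype(np.int16).ravel()
face_ids.tofile('object_ids.bin')

# Per-object metadata (class, location, size) as JSON for easy parsing in C++.
objects = {int(oid): {'class': obj['class_'],
                      'location': obj['location'].tolist(),
                      'size': obj['size'].tolist()}
           for oid, obj in data['object'].items()}
with open('objects.json', 'w') as f:
    json.dump(objects, f, indent=2)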

@msbaines

The semantic mask information is located in:

data['building']['object_inst_segmentation']

which is an array with one entry per mesh face. Each element contains the semantic object_id for that face; if there is no semantic information for a face, the id is 0.

The format of the array:

>>> type(data['building']['object_inst_segmentation'])
<class 'numpy.ndarray'>
>>> type(data['building']['object_inst_segmentation'][0])
<class 'numpy.ndarray'>
>>> data['building']['object_inst_segmentation'][0]
array([0.])
>>> data['building']['object_inst_segmentation'][0][0]
0.0
>>> type(data['building']['object_inst_segmentation'][0][0])
<class 'numpy.float64'>
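
Since the ids come back as an (N, 1) float64 array, it helps to flatten and cast them before use; a quick sanity check (variable names are just illustrative):

face_ids = data['building']['object_inst_segmentation'].astype(np.int64).ravel()
print(face_ids.shape)       # one id per mesh face
print(np.unique(face_ids))  # 0 (unannotated) plus the object ids present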

@msbaines

We should be able to write out the object ids using the following code:

object_ids = data['building']['object_inst_segmentation']
# One int16 id per face; 0 means the face has no semantic annotation.
with open("out.bin", "wb") as f:
    f.write(object_ids.astype(np.int16).tobytes())
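
int16 should be plenty here since Allensville only has 33 objects. A quick read-back check that the file round-trips (out.bin matching the snippet above):

ids = np.fromfile("out.bin", dtype=np.int16)
assert np.array_equal(ids, object_ids.astype(np.int16).ravel())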

@msbaines

Bounding box information:

Looking at the object schema:

 33: {'action_affordance': ['sit at',
                            'lay on',
                            'pick up',
                            'move',
                            'clean',
                            'set',
                            'decorate'],
      'class_': 'dining table',
      'floor_area': 3.473003668596787,
      'id': 33,
      'location': array([4.48357247, 6.70686119, 0.5614044 ]),
      'material': ['wood', None],
      'parent_room': 8,
      'size': array([1.14685995, 0.68447991, 0.6124021 ]),
      'surface_coverage': 1.3836484369355602,
      'tactile_texture': None,
      'visual_texture': None,
      'volume': 0.27629888509653355}}

location and size may be the center and full extents of the axis-aligned bounding box. This will have to be verified; a rough check against the mesh is sketched below.
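
A rough verification sketch, assuming the face order in the .obj matches object_inst_segmentation and that the mesh and the annotations share a coordinate frame (the .obj path is an example):

import numpy as np

# Minimal .obj reader: vertices and triangle vertex indices only.
verts, faces = [], []
with open('Allensville/mesh.obj') as f:
    for line in f:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == 'v':
            verts.append([float(x) for x in parts[1:4]])
        elif parts[0] == 'f':
            # Keep only the vertex index from "v", "v/vt" or "v/vt/vn".
            faces.append([int(p.split('/')[0]) - 1 for p in parts[1:4]])
verts, faces = np.array(verts), np.array(faces)

face_ids = data['building']['object_inst_segmentation'].astype(np.int64).ravel()
for oid, obj in data['object'].items():
    pts = verts[faces[face_ids == oid].ravel()]
    if len(pts) == 0:
        continue
    center = (pts.min(axis=0) + pts.max(axis=0)) / 2
    extent = pts.max(axis=0) - pts.min(axis=0)
    print(oid, obj['class_'], center - obj['location'], extent - obj['size'])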

@msbaines

Semantics seem to be working with the following transformation:

x1 = x0
y1 = -z0
z1 = y0
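
A minimal sketch of applying that transform to the .obj vertices before writing the semantic .ply (vertices assumed to be an (N, 3) numpy array in the original mesh frame):

# x1 = x0, y1 = -z0, z1 = y0 (swap Y and Z, negate the new Y)
v = np.asarray(vertices)                 # (N, 3) original .obj vertices
semantic_v = np.stack([v[:, 0],          # x1 =  x0
                       -v[:, 2],         # y1 = -z0
                       v[:, 1]], axis=1) # z1 =  y0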

msbaines pushed a commit that referenced this issue Dec 17, 2019
It is a two-step process to create a Gibson semantic mesh. First
you need to extract the object_id table from the .npz file. Then
you can create the semantic_mesh from the extracted ids file
and the .obj file the .npz is based on.

Addresses: Issue #374
mathfac added a commit that referenced this issue Jan 2, 2020
…is missing (#406)

With current code state 3dscenegraph semantic annotation files (*.scn) won't load, as our semantic loading pipeline triggers only on *.house files. To enable functionality implemented in #393 and #374 added loading of Gibson Semantics scene if MP3D semantic is missing. To test semantic loading e2e added integration test that will run only when *.scn test data is available.
mathfac added a commit that referenced this issue Jan 18, 2020
…ibson semantic scenes (#407)

To leverage 3dscenegraph semantic annotation spatial information added support of object's bounding boxes to Gibson semantic scenes. Related to issue #374 and depends on PR #406.
@ybgdgh

ybgdgh commented Sep 15, 2020

Hello, I want to know whether we can get the rooms' centers and bounding boxes from Habitat in the Gibson dataset. I used 3DSceneGraph as the Gibson semantics but only get the SemanticObject class. Thanks!

Ram81 pushed a commit to Ram81/habitat-web-sim that referenced this issue Dec 10, 2020
…is missing (facebookresearch#406)

With current code state 3dscenegraph semantic annotation files (*.scn) won't load, as our semantic loading pipeline triggers only on *.house files. To enable functionality implemented in facebookresearch#393 and facebookresearch#374 added loading of Gibson Semantics scene if MP3D semantic is missing. To test semantic loading e2e added integration test that will run only when *.scn test data is available.
Ram81 pushed a commit to Ram81/habitat-web-sim that referenced this issue Dec 10, 2020
…ibson semantic scenes (facebookresearch#407)

To leverage 3dscenegraph semantic annotation spatial information added support of object's bounding boxes to Gibson semantic scenes. Related to issue facebookresearch#374 and depends on PR facebookresearch#406.
luoj1 pushed a commit to luoj1/habitat-sim that referenced this issue Aug 16, 2022
…#430)

This is to implement the Encoder-Decoder CNN feature extractor to be used in the EQA baseline implementation, as in facebookresearch#374. Feature extraction from scene images is the first part of each of the subsequent trainers (VQA, PACMAN) in the EQA implementation.

Implementation based on EmbodiedQA, Das et al, CVPR 2018 (paper, code)
@aclegg3 added the enhancement (New feature or request) label Aug 31, 2022