-
Notifications
You must be signed in to change notification settings - Fork 488
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VLN Task & Dataset with shortest path example and tests. #233
Conversation
…ction. Assumes dataset episodes have 1 instruction and 1 trajectory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments. Will dive in more later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome progress! VLN task is a one of the most anticipated tasks in Habitat. Left a part of the comments.
configs/datasets/vln/mp3d_r2r.yaml
Outdated
DATASET: | ||
TYPE: R2RVLN-v1 | ||
SPLIT: train | ||
DATA_PATH: data/R2R/habitat_R2R_{split}.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To follow naming dataset files naming convention I would advice next naming:
DATA_PATH: data/R2R/habitat_R2R_{split}.json | |
DATA_PATH: data/datasets/r2r/{split}/{split}.json.gz |
configs/tasks/vln_r2r.yaml
Outdated
HABITAT_SIM_V0: | ||
GPU_DEVICE_ID: 0 | ||
RGB_SENSOR: | ||
WIDTH: 512 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that expected that RGB resolution is different from Depth resolution?
configs/datasets/vln/mp3d_r2r.yaml
Outdated
TYPE: R2RVLN-v1 | ||
SPLIT: train | ||
DATA_PATH: data/R2R/habitat_R2R_{split}.json | ||
SCENES_DIR: data/mp3d/scenes/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default path is expected to be:
SCENES_DIR: data/mp3d/scenes/ | |
SCENES_DIR: "data/scene_datasets/" |
configs/tasks/vln_r2r.yaml
Outdated
TYPE: R2RVLN-v1 | ||
SPLIT: train | ||
DATA_PATH: data/datasets/R2R/hb_R2R_{split}.json | ||
SCENES_DIR: data/scene_datasets/mp3d/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment regarding dataset paths.
configs/tasks/vln_r2r.yaml
Outdated
HEIGHT: 256 | ||
TASK: | ||
TYPE: VLN-v0 | ||
SUCCESS_DISTANCE: 2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 meters are acceptable success distance? Can be pretty large threshold.
DATASET: | ||
TYPE: R2RVLN-v1 | ||
SPLIT: val_seen | ||
DATA_PATH: data/datasets/R2R/hb_R2R_{split}.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dataset files paths and naming.
HEIGHT: 256 | ||
TASK: | ||
TYPE: VLN-v0 | ||
SUCCESS_DISTANCE: 3.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The success distance is set to 3m here which is the distance defined in the original VLN paper and is used by most papers in this space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are really close, left some final comments. Is it possible to add link to data download into README
? Thank you!
configs/tasks/vln_r2r.yaml
Outdated
TYPE: VLN-v0 | ||
SUCCESS_DISTANCE: 3.0 | ||
SENSORS: ['INSTRUCTION_SENSOR'] | ||
POSSIBLE_ACTIONS: ['MOVE_FORWARD', 'TURN_LEFT', 'TURN_RIGHT', 'STOP'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe worth to make 'STOP'
0 action, as list of other actions can expand.
test/test_r2r_vln.py
Outdated
dataset_config | ||
): | ||
pytest.skip( | ||
"Please download Matterport3D R2R dataset to " "data folder." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Please download Matterport3D R2R dataset to " "data folder." | |
"Please download Matterport3D R2R dataset to data folder." |
test/test_r2r_vln.py
Outdated
follower = ShortestPathFollower( | ||
env.habitat_env.sim, goal_radius=0.5, return_one_hot=False | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,120 @@ | |||
#!/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for adding relevant tests.
habitat/tasks/vln/vln.py
Outdated
start_rotation: numpy ndarray with 4 entries for (x, y, z, w) | ||
elements of unit quaternion (versor) representing agent 3D | ||
orientation. | ||
instruction: single instruction guide to goal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instruction: single instruction guide to goal. | |
instruction: single natural language instruction guide to goal. |
|
||
|
||
@registry.register_task(name="VLN-v0") | ||
class VLNTask(NavigationTask): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extensive doc string should be added with concise description of the task.
CONTENT_SCENES_PATH_FIELD = "content_scenes_path" | ||
DEFAULT_SCENE_PATH_PREFIX = "data/scene_datasets/mp3d/" | ||
|
||
R2R_TRAIN_EPISODES = 10837 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constants are never used.
output_im, observations["instruction"]["text"] | ||
) | ||
images.append(output_im) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all functions above required? Can we import and reuse functions from other examples or util functions?
return avg_metrics | ||
|
||
|
||
def main(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we would like to have same launcher for different task benchmarks. It should be possible to incorporate task config into examples/benchmark.py
and make it support VLN as well. But most probably one more registry needed for supported agents/baselines. Let's create follow up issue with merging benchmark.py
and vln_benchmark.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job @koshyanand and @jacobkrantz, really like how you figured out a lot details from existing code. Some minor comments otherwise ready to merge.
os.makedirs(IMAGE_DIR) | ||
|
||
|
||
def append_text_to_image(orig_img, text): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move to habitat.utils.visualizations.utils
.
|
||
deserialized = json.loads(json_str) | ||
|
||
# Done for the serialization test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason that we need this if? Why we don't have deserialized["instruction_vocab"]["word_list"]
in released version of dataset?
habitat/tasks/vln/vln.py
Outdated
trajectory_id: id of ground truth trajectory path. | ||
goals: relevant goal object/room. | ||
""" | ||
path: List[List[float]] = attr.ib( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the rotation in ShortestPathPoint
optional sounds good. You can re-use visualization we have for shortest path already.
@@ -0,0 +1,92 @@ | |||
#!/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can include your names as owners/maintainers of the VLN task here as well as reflect your contribution in README section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added our names in this file. For the README section, we can wait until we have a paper (as per discussion).
configs/tasks/vln_r2r.yaml
Outdated
RGB_SENSOR: | ||
WIDTH: 256 | ||
HEIGHT: 256 | ||
HFOV: 45 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 45 degrees? That seems incredibly narrow
configs/datasets/vln/mp3d_r2r.yaml
Outdated
DATASET: | ||
TYPE: R2RVLN-v1 | ||
SPLIT: train | ||
DATA_PATH: "data/datasets/vln/r2r/v1/{split}/{split}.json.gz" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with our dataset paths let change path to everywhere:
DATA_PATH: "data/datasets/vln/r2r/v1/{split}/{split}.json.gz" | |
DATA_PATH: "data/datasets/vln/mp3d/r2r/v1/{split}/{split}.json.gz" |
R2R_VAL_SEEN_EPISODES = 781 | ||
|
||
|
||
@registry.register_dataset(name="R2RVLN-v1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should register it MP3DR2RVLN-v1
or change doc string not attach it to MP3D only. From my experience this class can be scene dataset agnostic.
@registry.register_dataset(name="R2RVLN-v1") | |
@registry.register_dataset(name="R2RVLN-v1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the doc string as the class is not tied to MP3D.
@erikwijmans @mathfac back to you! |
@jacobkrantz and @koshyanand thank you for the contribution of such popular Embodied AI task and following all the comments. |
This test is randomly failing (#233). Try to make it deterministic so that it always fails or always passes.
…earch#233) * Add VLN Task * Add reading of Room-to-Room Dataset * Add VLN Instruction Sensor * Added shortest path follower * Added benchmark for Random, ForwardOnly, RandomForward & GoalFollower agent * Represent instructions as list of tokens * Added vocabulary API to R2R dataset
Motivation and Context
Add VLN Task
Add reading of Room-to-Room Dataset
Add VLN Instruction Sensor
Added shortest path follower
Added benchmark for Random, ForwardOnly, RandomForward & GoalFollower agent
Represent instructions as list of tokens
Added vocabulary API to R2R dataset
How Has This Been Tested
Config test
Dataset size and serialization test
Instruction Sensor test & test to see if the sensor instruction matches the current instruction
Types of changes
Checklist