Contrastive-VisionVAE-Follower is a model for the multi-modal task of Vision-and-Language Navigation (VLN).
Official implementation of the NAACL 2024 paper "Navigation as Attackers Wish? Towards Building Robust Embodied Agents under Federated Learning"
Official repository of "Mind the Error! Detection and Localization of Instruction Errors in Vision-and-Language Navigation". We present the first dataset, R2R-IE-CE, to benchmark instruction errors in VLN, and propose a detection method, IEDL.
A list of research papers on knowledge-enhanced multimodal learning
Fast-Slow Test-time Adaptation for Online Vision-and-Language Navigation
LACMA: Language-Aligning Contrastive Learning with Meta-Actions for Embodied Instruction Following
Code for 'Chasing Ghosts: Instruction Following as Bayesian State Tracking' published at NeurIPS 2019
[ECCV 2022] Official pytorch implementation of the paper "FedVLN: Privacy-preserving Federated Vision-and-Language Navigation"
Code for ORAR Agent for Vision and Language Navigation on Touchdown and map2seq
Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty
Companion repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations.
Repository for "Vision-and-Language Navigation via Causal Learning" (accepted by CVPR 2024)
Code and data of the Fine-Grained R2R dataset proposed in the EMNLP 2021 paper "Sub-Instruction Aware Vision-and-Language Navigation"
Code of the NeurIPS 2021 paper: Language and Visual Entity Relationship Graph for Agent Navigation
PyTorch code for the ICRA 2021 paper "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"
Code and data of the CVPR 2022 paper "Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation"
Code of the CVPR 2021 oral paper "A Recurrent Vision-and-Language BERT for Navigation"
A curated list of research papers in Vision-Language Navigation (VLN)
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"