cv516Buaa/OVGNet

OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping


Meng Li · Qi Zhao · Shuchang Lyu · Chunlei Wang · Yujing Ma · Guangliang Cheng · Chenguang Yang

Highlight!!!!

This repo is the implementation of "OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping". It builds on Vision-Language-Grasping, GroundingDINO, and VL-Grasp. Many thanks to the authors of these excellent repos.

Demo Setting

  • Novel denotes object classes that are unseen during training.
  • Base denotes object classes that are seen during training.
  • Battery and power drill are novel classes and belong to the hard task.
  • Apple and pear are base classes and belong to the simple task.

Demo Video

Grasping_demo.mp4

Dataset

  • OVGrasping follows the GroundingDINO data format.
  • The OVGrasping dataset comprises 117 categories and 63,385 instances.
  • Instances are drawn from three sources: RoboRefIt, GraspNet, and a simulated environment.
  • The dataset is divided into two subsets: the base subset contains 51,857 instances, and the novel subset contains 11,528 instances.
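
As a quick consistency check on the figures above, the sketch below confirms that the base and novel counts add up to the stated total and reports each subset's share. The variable and function names are illustrative only and are not part of the repository.

```python
# Instance counts as stated in the Dataset section.
TOTAL_INSTANCES = 63_385
SPLIT_COUNTS = {"base": 51_857, "novel": 11_528}


def split_shares(counts: dict) -> dict:
    """Return each subset's share of all instances, as a percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# The two subsets should account for every instance in the dataset.
assert sum(SPLIT_COUNTS.values()) == TOTAL_INSTANCES

shares = split_shares(SPLIT_COUNTS)
print(shares)  # {'base': 81.8, 'novel': 18.2}
```

Roughly four out of five instances belong to base classes, so novel-class results are measured on a much smaller pool.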

Installation

  • Ubuntu==18.04
  • Python==3.9
  • Torch==1.11, Torchvision==0.12.0
  • CUDA==11.3
  • checkpoint==OVGANet
  • assets==assets

Please place the assets in the OVGNet folder.
Please ensure the CUDA version is 11.3.

conda create -n OVGNet python=3.9
conda activate OVGNet
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
cd /OVGNet/
pip install -r requirments.txt
cd /OVGNet/graspnet/graspnet/pointnet2
python setup.py install
cd /OVGNet/graspnet/graspnet/knn
python setup.py install
cd /OVGNet/groundingdino
pip install -e .
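
After installing, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch using only the Python standard library; the helper names (`version_mismatches`, `installed_versions`) are hypothetical and not part of OVGNet, and versions are compared by prefix so that `1.11` accepts `1.11.0+cu113`.

```python
import sys
from importlib import metadata

# Versions listed in the Installation section; matching is by prefix.
EXPECTED = {"torch": "1.11", "torchvision": "0.12.0"}


def version_mismatches(found: dict, expected: dict) -> list:
    """Return messages for packages that are missing or off-version."""
    problems = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            problems.append(f"{name}: not installed (expected {want})")
        elif not have.startswith(want):
            problems.append(f"{name}: found {have}, expected {want}")
    return problems


def installed_versions(names) -> dict:
    """Look up installed package versions; missing packages are skipped."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    problems = version_mismatches(installed_versions(EXPECTED), EXPECTED)
    if sys.version_info[:2] != (3, 9):
        found_py = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"python: found {found_py}, expected 3.9")
    print("\n".join(problems) or "Environment matches the listed versions.")
```

This does not check CUDA itself. Once PyTorch is installed, `torch.version.cuda` reports the CUDA version it was built against, which should read 11.3 for the wheels above.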

Run

cd /OVGNet/
python test.py --testing_case_dir ./test_cases/simple/apple --pretrain ./checkpoint/OVGANet
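
To run several test cases in a row, the command above can be generated once per case directory. The sketch below assumes a `<cases_root>/<task>/<object>` layout, inferred only from the single example path `./test_cases/simple/apple`; the function name `build_test_commands` is hypothetical, and only the `--testing_case_dir` and `--pretrain` flags shown above are used.

```python
from pathlib import Path


def build_test_commands(cases_root, checkpoint="./checkpoint/OVGANet"):
    """Build one test.py command per case directory under cases_root."""
    root = Path(cases_root)
    commands = []
    # Assumed layout: <cases_root>/<task>/<object>, e.g. test_cases/simple/apple.
    for case_dir in sorted(p for p in root.glob("*/*") if p.is_dir()):
        commands.append([
            "python", "test.py",
            "--testing_case_dir", case_dir.as_posix(),
            "--pretrain", checkpoint,
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the OVGNet root, so that a failing case stops the batch instead of being silently skipped.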

Test on OVGrasping

cd /OVGNet/test_vg/
python test_vg.py --c ./config/cfg_odvg.py --datasets ./config/datasets_vg_example.json --pretrain_model_path OVGNet/checkpoint/OVGANet

Cite

@InProceedings{Li_2024_IROS,
    author = {Li, Meng and Zhao, Qi and Lyu, Shuchang and Wang, Chunlei and Ma, Yujing and Cheng, Guangliang and Yang, Chenguang},
    title = {OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping},
    year = {2024},
    eprint = {2407.13175},
    archivePrefix = {arXiv},
    primaryClass = {cs.RO},
    url = {https://arxiv.org/abs/2407.13175}, 
}
