Finetuning and dataset formatting guidelines #2

Closed
hinsonan opened this issue Apr 29, 2024 · 2 comments

Labels
documentation Improvements or additions to documentation

Comments

@hinsonan

Very cool work, and congrats on what you have accomplished. I wanted to know if you all have plans to release a finetuning guide and instructions on how to format datasets.

@machuofan
Collaborator

Hi there, thanks for your interest in our work. Here are some tips you may follow to finetune the model on customized datasets:

  1. Format your data. There are various dataset templates under groma/data/datasets. For example, you can refer to refcoco_rec.py to format REC data, visual_genome.py for region captioning, llava.py for conversation, and so on. BTW, don't forget to register the new dataset in groma/data/build.py.
  2. Download the pretrained checkpoint groma-7b-pretrain.
  3. Configure groma/data/configs/vl_finetune.py and scripts/vl_finetune.sh, then run
    bash scripts/vl_finetune.sh {path_to_groma_7b_pretrain_ckpt} {output_dir}
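
If you want to launch step 3 from Python (for example, from a notebook or a job scheduler), the command above can be wrapped as in the minimal sketch below. The helper names `build_finetune_command` and `launch_finetune` are hypothetical and not part of Groma. The only thing taken from the steps above is the script path and its two positional arguments (pretrained checkpoint path, then output directory).

```python
import shlex
import subprocess


def build_finetune_command(ckpt_path, output_dir, script="scripts/vl_finetune.sh"):
    """Build the argv list for the finetuning command from step 3.

    Hypothetical helper: it only mirrors
    `bash scripts/vl_finetune.sh {path_to_groma_7b_pretrain_ckpt} {output_dir}`.
    """
    # Positional order follows the command in step 3:
    # pretrained checkpoint first, output directory second.
    return ["bash", str(script), str(ckpt_path), str(output_dir)]


def launch_finetune(ckpt_path, output_dir):
    """Print and run the command; raises CalledProcessError on a non-zero exit."""
    cmd = build_finetune_command(ckpt_path, output_dir)
    # shlex.join quotes arguments so the printed line can be pasted into a shell.
    print("Running:", shlex.join(cmd))
    # Assumes the current working directory is the repository root,
    # so that the relative script path resolves.
    subprocess.run(cmd, check=True)
```

Calling `launch_finetune(ckpt_path, output_dir)` from the repository root, with your own checkpoint and output paths, should be equivalent to running the shell command directly. Dataset registration and the config edits from steps 1 and 3 still have to be done by hand first.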

@machuofan added the documentation label Apr 30, 2024
@machuofan changed the title from "Do you Plan to release Finetuning and dataset formatting guides?" to "Finetuning and dataset formatting guidelines" Apr 30, 2024
@hinsonan
Author

hinsonan commented May 1, 2024

Thank you for your response. Perhaps if I have some time, I can update the documentation and provide a fine-tuning section. Someone else may be able to get to it sooner than me.


2 participants