
How to finetune Osprey on RefCOCOg? #28

Open

Glupapa opened this issue Mar 26, 2024 · 1 comment


Glupapa commented Mar 26, 2024

Hi! Thanks for the great work!

Could you share any configs for fine-tuning Osprey on the RefCOCOg dataset? I am trying to follow your work and reproduce the results on it. What are the starting checkpoint and the prompt template? It would be much appreciated if you could share any fine-tuning config.
Thank you!

CircleRadon (Owner) commented Mar 27, 2024

Hi @Glupapa,
The starting checkpoint is our final model, Osprey-7b, and the prompt template is the same as the one used for RefCOCO, as in https://github.com/CircleRadon/Osprey/blob/ca9f26dbd9a0907d8ff686588a394fa897b60828/osprey/datasets/stage2_data.py#L256C26-L262C1
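For reference, the linked lines hold the question templates for RefCOCO-style referring data; typically one is sampled per region. Purely as an assumed illustration (the authoritative strings and the exact region-token syntax are those in stage2_data.py), they take roughly this shape:

# Assumed sketch -- see the linked stage2_data.py lines for the real templates.
REF_QUESTIONS = [
    "Please give me a short description of <region>.",   # hypothetical wording
    "Can you give me a short description of <region>?",  # hypothetical wording
]
# "<region>" is a stand-in for however the codebase injects the region/mask tokens.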
The config is as follows:

#!/bin/bash
export PYTHONPATH=`pwd`:$PYTHONPATH

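# Fine-tune the released Osprey-7b checkpoint on 4 local GPUs;
# the training data (e.g. RefCOCOg) is listed in the --dataset_config JSON below.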
deepspeed --include localhost:0,1,2,3 osprey/train/train_mem.py \
    --deepspeed ./scripts/zero2.json \
    --model_name_or_path ./Osprey-7b \
    --dataset_config ./osprey/configs/finetune.json \
    --version v1 \
    --vision_tower laion2b_s29b_b131k_ft_soup.bin \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --bf16 True \
    --output_dir './exp/finetune' \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 3000 \
    --save_total_limit 1 \
    --learning_rate 5e-6 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to "none" \
    --group_by_modality_length False
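
For reference, the file passed via --dataset_config lists the training datasets. The authoritative schema is whatever the configs shipped under osprey/configs/ in the repo use; purely as an assumed sketch (every field name below is a hypothetical placeholder), a RefCOCOg-only finetune.json might look like the following, written here as a Python dict so it can be annotated:

# Assumed sketch of the contents of ./osprey/configs/finetune.json;
# mirror the real schema from the configs under osprey/configs/.
finetune_config = {
    "datasets": [
        {
            "type": "RefCOCOg",                              # hypothetical dataset key
            "ann_file": "./data/refcocog/annotations.json",  # your local RefCOCOg annotations
            "img_prefix": "./data/coco/train2014/",          # COCO train2014 images
        }
    ]
}

Point ann_file and img_prefix at your local RefCOCOg annotation file and COCO image directory.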
