Hi! I am working on fine-tuning the Kosmos-2 model for my own application. In short, the target may appear multiple times in an image (e.g., cars in a parking lot), but there can also be images with only one target.
Right now, I am preparing the dataset as follows:
```python
if len(bboxes) > 1:
    text = "<grounding>" + f"<phrase> several {target}s</phrase>"
else:
    text = "<grounding>" + f"<phrase> a {target}</phrase>"
data_list.append({'bbox': [bboxes], 'image': image, 'text': text})
```
In this code, `bboxes` is the human-annotated bounding boxes for one image, formatted as a list of lists of tuples, and `{target}` is the placeholder for my target (a noun).
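For reference, the snippet above can be written as a self-contained sketch. This is only an illustration of my setup (the `build_example` helper name and the normalized `[0, 1]` coordinate convention are assumptions, not part of the original code):

```python
def build_example(image, target, bboxes):
    """Build one training example for a single target noun.

    `bboxes`: list of (x1, y1, x2, y2) tuples for ONE image, assumed
    here to be in normalized [0, 1] coordinates; adjust if your
    annotations are in pixels.
    """
    if len(bboxes) > 1:
        phrase = f"several {target}s"
    else:
        phrase = f"a {target}"
    text = f"<grounding><phrase> {phrase}</phrase>"
    # Every box is grouped under the single phrase, so the intent is
    # to supervise the model to emit all of them, not just the first.
    return {"bbox": [bboxes], "image": image, "text": text}

example = build_example(None, "car", [(0.1, 0.2, 0.3, 0.4), (0.5, 0.5, 0.8, 0.9)])
```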
When I train the model with such prompts, it still outputs one and only one bounding box for the target, even when there are multiple targets in the image.
For example, if the target is "car", the model will only output a bounding box for one of the multiple cars in the image.
May I ask how I can solve this issue?
Note
"Car" is an random example, the target is something we believe it's rare in the Kosmos-2 pre-training data.
Hello, as you know, we haven't fine-tuned this model on any specific object detection dataset, so we cannot control how many bboxes the model will generate; it could be one or multiple.
Perhaps you can try some different prompts, e.g. `Describe this image in detail. a {target}` / `several {target}s`.
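If it helps, the suggested prompt variants could be generated with a small helper like this (a sketch only; the exact caption-style wording is illustrative, and `make_prompt` is a hypothetical name):

```python
def make_prompt(target, n_boxes):
    """Sketch of the suggested prompt variant: a caption-style prefix
    followed by a grounded phrase for the target noun."""
    noun = f"several {target}s" if n_boxes > 1 else f"a {target}"
    # The prefix wording here is only an example; trying a few
    # phrasings and comparing results is the point of the suggestion.
    return f"<grounding> Describe this image in detail: <phrase> {noun}</phrase>"
```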
Model I am using: Kosmos-2