Modifying or fusing vision modules #99

Why0912 · 2024-07-01T09:01:20Z

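Is it supported to modify the vision module, or to fuse the visual representations of multiple backbones?
If I modify or fuse them, do I need to redo pre_train to obtain the corresponding projector weights?
Or how should the projector be modified?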

Isaachhh · 2024-07-06T10:37:28Z

To use a different vision tower or projector, you can import whichever one you like. Pay attention to multimodal_encoder and multimodal_projector: you need to add the code for the new class and modify the corresponding build function.
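As a rough illustration of "add the class and modify the build function", here is a minimal sketch. It is not Bunny's actual code: the class `GatedMLPProjector`, the type string `"gated_mlp"`, and the config fields `mm_projector_type`, `mm_hidden_size`, and `hidden_size` are assumptions chosen for the example, and the dimensions are toy values.

```python
from types import SimpleNamespace

import torch
import torch.nn as nn


class GatedMLPProjector(nn.Module):
    """Hypothetical custom projector: maps vision features to the LLM hidden size."""

    def __init__(self, vision_dim: int, llm_dim: int):
        super().__init__()
        self.up = nn.Linear(vision_dim, llm_dim)
        self.gate = nn.Linear(vision_dim, llm_dim)
        self.out = nn.Linear(llm_dim, llm_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, vision_dim)
        return self.out(self.up(x) * torch.sigmoid(self.gate(x)))


def build_vision_projector(config):
    """Dispatch on a config string; add one branch per new projector class."""
    projector_type = getattr(config, "mm_projector_type", "linear")
    if projector_type == "linear":
        return nn.Linear(config.mm_hidden_size, config.hidden_size)
    if projector_type == "gated_mlp":  # branch added for the new class
        return GatedMLPProjector(config.mm_hidden_size, config.hidden_size)
    raise ValueError(f"Unknown projector type: {projector_type}")


# Toy sizes (assumed) so the sketch runs quickly on CPU.
config = SimpleNamespace(mm_projector_type="gated_mlp", mm_hidden_size=32, hidden_size=64)
projector = build_vision_projector(config)
tokens = projector(torch.randn(2, 16, 32))  # (batch, num_patches, llm_dim)
```

A new vision tower would presumably follow the same pattern: one new class plus one new branch in its own build function.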

To combine multiple vision features, you also need to modify Bunny's architecture (for example, something like a vision_tower_list), the encode_image function, and so on.
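One possible shape for such a change is sketched below. It only borrows the names `vision_tower_list` and `encode_image` from the comment above; everything else is an assumption. `ToyVisionTower` is a stand-in for a real backbone, fusion is done by plain channel-wise concatenation, and both towers are assumed to produce the same number of patch tokens.

```python
import torch
import torch.nn as nn


class ToyVisionTower(nn.Module):
    """Stand-in for a real backbone; returns (batch, num_patches, hidden_size)."""

    def __init__(self, hidden_size: int, patch: int = 8):
        super().__init__()
        self.hidden_size = hidden_size
        self.embed = nn.Conv2d(3, hidden_size, kernel_size=patch, stride=patch)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # (B, C, H/p, W/p) -> (B, num_patches, C)
        return self.embed(images).flatten(2).transpose(1, 2)


class MultiTowerEncoder(nn.Module):
    """Hypothetical fusion module: several towers feeding one projector."""

    def __init__(self, towers, llm_dim: int):
        super().__init__()
        self.vision_tower_list = nn.ModuleList(towers)
        # Concatenation makes the projector input the sum of tower widths.
        fused_dim = sum(t.hidden_size for t in towers)
        self.mm_projector = nn.Linear(fused_dim, llm_dim)

    def encode_image(self, images: torch.Tensor) -> torch.Tensor:
        feats = [tower(images) for tower in self.vision_tower_list]
        fused = torch.cat(feats, dim=-1)  # requires equal num_patches per tower
        return self.mm_projector(fused)


model = MultiTowerEncoder([ToyVisionTower(32), ToyVisionTower(48)], llm_dim=64)
image_tokens = model.encode_image(torch.randn(2, 3, 32, 32))
```

Because the projector's input width changes with the set of towers in this sketch, a projector trained for a single tower would not fit the fused features without retraining.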

Generally, you need to pre-train and fine-tune by yourself. Under some circumstances, you may start from our released weights.

Why0912 · 2024-07-15T02:41:45Z

Thanks for the reply. One more question: roughly what compute resources does pre_train require?

Isaachhh · 2024-07-15T02:47:19Z

#90

We always use 8*A100.

Isaachhh · 2024-08-06T03:31:10Z

Closing the issue for now since there is no further discussion. Feel free to reopen it if you have any other questions.

Isaachhh closed this as completed Aug 6, 2024
