Modifying or fusing vision modules #99

Closed
Why0912 opened this issue Jul 1, 2024 · 4 comments

Why0912 commented Jul 1, 2024

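Is it supported to modify the vision module, or to fuse visual representations from multiple backbones?
If I modify or fuse them, do I need to redo pre_train to obtain the corresponding projector weights?
Alternatively, how should the projector itself be modified?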

Collaborator

Isaachhh commented Jul 6, 2024

For a different vision tower or projector, you can import whichever one you like. Look at multimodal_encoder and multimodal_projector: add the code for your new class there and modify the corresponding build function.
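As a rough illustration of "add the class and modify the build function", here is a minimal, framework-free sketch of a registry-style builder. Every name in it (`VISION_TOWER_REGISTRY`, `register_vision_tower`, `MyVisionTower`, `build_vision_tower`) is hypothetical and chosen for this example; Bunny's actual builder may be organized differently, so treat it as a pattern rather than the repository's code.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a lookup key to a vision tower class.
VISION_TOWER_REGISTRY: Dict[str, Callable[..., object]] = {}


def register_vision_tower(name: str):
    """Decorator that records a vision tower class under a lookup key."""
    def decorator(cls):
        VISION_TOWER_REGISTRY[name] = cls
        return cls
    return decorator


@register_vision_tower("my_encoder")
class MyVisionTower:
    """Placeholder for a new encoder class; a real one would wrap a model."""

    def __init__(self, model_name: str, hidden_size: int = 1024):
        self.model_name = model_name
        self.hidden_size = hidden_size


def build_vision_tower(model_name: str, **kwargs):
    """Return the first registered tower whose key occurs in the model name."""
    lowered = model_name.lower()
    for key, cls in VISION_TOWER_REGISTRY.items():
        if key in lowered:
            return cls(model_name, **kwargs)
    raise ValueError(f"Unknown vision tower: {model_name}")
```

With this layout, supporting another encoder means writing one new class and registering it; the same pattern could be applied to a projector builder.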

To combine features from multiple vision towers, you also need to modify Bunny's architecture (for example, hold something like a vision_tower_list) and update the encode_image function, among other places.
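To make the fusion idea concrete, the toy sketch below runs several encoders and concatenates their outputs channel-wise, patch by patch. It uses plain Python lists instead of tensors, and `tower_a`, `tower_b`, and this `encode_image` signature are assumptions made for illustration only; channel concatenation is just one possible fusion strategy, and it is not taken from Bunny's code.

```python
from typing import Callable, List, Sequence

# A "feature map" here is a list of patch tokens, each a list of floats.
Features = List[List[float]]
Tower = Callable[[Sequence[float]], Features]


def tower_a(image: Sequence[float]) -> Features:
    """Toy encoder: 2 patches, 3 channels each."""
    return [[float(sum(image))] * 3 for _ in range(2)]


def tower_b(image: Sequence[float]) -> Features:
    """Toy encoder: 2 patches, 2 channels each."""
    return [[float(max(image))] * 2 for _ in range(2)]


def encode_image(image: Sequence[float], vision_tower_list: List[Tower]) -> Features:
    """Run every tower and concatenate their features patch by patch."""
    per_tower = [tower(image) for tower in vision_tower_list]
    num_patches = len(per_tower[0])
    if any(len(feats) != num_patches for feats in per_tower):
        raise ValueError("All towers must produce the same number of patches")
    # Channel-wise concatenation: token i joins token i from each tower.
    return [
        [value for feats in per_tower for value in feats[i]]
        for i in range(num_patches)
    ]


fused = encode_image([0.5, 1.0, 1.5], [tower_a, tower_b])
projector_input_dim = len(fused[0])  # 3 + 2 channels
```

Under this scheme the fused token width is the sum of the individual tower widths, so a projector sized for a single tower would no longer match its input and would need to be resized and retrained.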

Generally, you need to pre-train and fine-tune the model yourself. Under some circumstances, you may be able to start from our released weights.

Author

Why0912 commented Jul 15, 2024

Thanks for the reply. One more question: roughly what compute resources does pre_train require?

Collaborator

Isaachhh commented Jul 15, 2024

#90

We always use 8 × A100 GPUs.

Collaborator

Isaachhh commented Aug 6, 2024

Closing the issue for now since there is no further discussion. Feel free to reopen it if you have any other questions.

Isaachhh closed this as completed Aug 6, 2024