I'm evaluating the model on a relatively large dataset (single question, single answer). I was able to fine-tune the Bunny-1.1-Llama-3-8B-V model using one of the scripts provided. What is the best strategy to implement batch inference?
Sorry, we don't currently support batch inference. You could split the dataset into multiple parts and launch a model instance on each GPU, as we do when evaluating on VQA, GQA, and SEED-Bench.
Note, however, that we do not set the attention_mask of left-padding tokens to 0. The attention_mask of the inputs is all 1s, so the outputs may differ slightly from single-sample inference.
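The splitting step can be sketched as a simple chunking helper (a minimal, framework-free illustration; the function name and the idea of one contiguous chunk per GPU are my assumptions, not part of the Bunny evaluation scripts):

```python
def split_chunks(items, n_parts):
    """Split `items` into `n_parts` near-equal contiguous chunks,
    e.g. one chunk per GPU/process."""
    k, r = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `r` chunks get one extra item so sizes differ by at most 1.
        end = start + k + (1 if i < r else 0)
        chunks.append(items[start:end])
        start = end
    return chunks

# Example: 10 samples over 4 GPUs -> chunk sizes [3, 3, 2, 2]
sizes = [len(c) for c in split_chunks(list(range(10)), 4)]
```

Each chunk can then be written to its own question file and passed to a separate inference process, with the per-GPU answer files merged afterwards.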
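For reference, correct left padding would zero the mask over pad positions. Below is a minimal, library-free sketch of left-padding a batch of token-id sequences and building the corresponding attention_mask (the function name and `pad_id` value are illustrative assumptions, not Bunny code):

```python
def left_pad_batch(sequences, pad_id=0):
    """Left-pad variable-length token-id lists to a common length and
    build an attention mask: 0 over padding, 1 over real tokens."""
    max_len = max(len(s) for s in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Padding goes on the LEFT so generation continues from real tokens.
        input_ids.append([pad_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask

# Example: two prompts of different lengths.
ids, mask = left_pad_batch([[5, 6, 7], [8, 9]], pad_id=0)
```

With a Hugging Face tokenizer, setting `tokenizer.padding_side = "left"` and passing the returned `attention_mask` to `model.generate` achieves the same effect, which should remove the small discrepancy from single-sample inference.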