python others/test_diff_vlm/InternLM_XComposer.py
Set max length to 16384
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00, 1.47s/it]
Traceback (most recent call last):
File "/mnt/data/mmyu/eqa/explore-eqa/others/test_diff_vlm/InternLM_XComposer.py", line 19, in <module>
response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 594, in chat
inputs, im_mask, _ = self.interleav_wrap_chat(query, image, history=history, meta_instruction=meta_instruction, hd_num=hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 273, in interleav_wrap_chat
img = self.encode_img(image[idx], hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 164, in encode_img
image = Image_transform(image, hd_num = hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/ixc_utils.py", line 46, in Image_transform
img = padding_336(img, 560)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/ixc_utils.py", line 24, in padding_336
b = transforms.functional.pad(b, [left_padding, top_padding, right_padding, bottom_padding], fill=[255,255,255])
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/functional.py", line 516, in pad
return F_pil.pad(img, padding=padding, fill=fill, padding_mode=padding_mode)
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/_functional_pil.py", line 175, in pad
opts = _parse_fill(fill, img, name="fill")
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/_functional_pil.py", line 271, in _parse_fill
raise ValueError(msg.format(len(fill), num_channels))
ValueError: The number of elements in 'fill' does not match the number of channels of the image (3 != 4)
Same issue here, but I was inspired by this.
I then converted my image to the RGB mode using the following code, and finally ran example_chat.py successfully.
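A minimal sketch of the workaround, using Pillow. The root cause in the traceback is that `padding_336` pads with `fill=[255, 255, 255]` (3 values), which fails for a 4-channel RGBA image; converting the input to RGB before passing it to `model.chat` removes the mismatch. The in-memory sample image here is illustrative; in practice you would load your own file with `Image.open`.

```python
from PIL import Image

# Stand-in for the problematic input: a 4-channel RGBA image.
# In practice: img = Image.open("your_image.png")
img = Image.new("RGBA", (64, 64), (255, 0, 0, 128))

# padding_336 uses a 3-element fill, so the image must be 3-channel RGB.
if img.mode != "RGB":
    img = img.convert("RGB")  # drops the alpha channel

print(img.mode)  # RGB
```

After this conversion the padded fill `[255, 255, 255]` matches the image's channel count, and `transforms.functional.pad` no longer raises the `ValueError`.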