python others/test_diff_vlm/InternLM_XComposer.py
Set max length to 16384
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:04<00:00, 1.47s/it]
Traceback (most recent call last):
File "/mnt/data/mmyu/eqa/explore-eqa/others/test_diff_vlm/InternLM_XComposer.py", line 19, in <module>
response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 594, in chat
inputs, im_mask, _ = self.interleav_wrap_chat(query, image, history=history, meta_instruction=meta_instruction, hd_num=hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 273, in interleav_wrap_chat
img = self.encode_img(image[idx], hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/modeling_internlm_xcomposer2.py", line 164, in encode_img
image = Image_transform(image, hd_num = hd_num)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/ixc_utils.py", line 46, in Image_transform
img = padding_336(img, 560)
File "/home/mmyu/.cache/huggingface/modules/transformers_modules/internlm-xcomposer2d5-7b/ixc_utils.py", line 24, in padding_336
b = transforms.functional.pad(b, [left_padding, top_padding, right_padding, bottom_padding], fill=[255,255,255])
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/functional.py", line 516, in pad
return F_pil.pad(img, padding=padding, fill=fill, padding_mode=padding_mode)
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/_functional_pil.py", line 175, in pad
opts = _parse_fill(fill, img, name="fill")
File "/home/mmyu/anaconda3/envs/eval_cog/lib/python3.9/site-packages/torchvision/transforms/_functional_pil.py", line 271, in _parse_fill
raise ValueError(msg.format(len(fill), num_channels))
ValueError: The number of elements in 'fill' does not match the number of channels of the image (3 != 4)
Same issue here, but I was inspired by this.
I then converted my image to the RGB mode using the following code, and finally ran example_chat.py successfully.
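A minimal sketch of the workaround, using Pillow. The root cause in the traceback is that `padding_336` pads with `fill=[255, 255, 255]` (3 values), which fails for a 4-channel RGBA image; converting the input to RGB before passing it to `model.chat` removes the mismatch. The in-memory sample image here is illustrative; in practice you would load your own file with `Image.open`.

```python
from PIL import Image

# Stand-in for the problematic input: a 4-channel RGBA image.
# In practice: img = Image.open("your_image.png")
img = Image.new("RGBA", (64, 64), (255, 0, 0, 128))

# padding_336 uses a 3-element fill, so the image must be 3-channel RGB.
if img.mode != "RGB":
    img = img.convert("RGB")  # drops the alpha channel

print(img.mode)  # RGB
```

After this conversion the padded fill `[255, 255, 255]` matches the image's channel count, and `transforms.functional.pad` no longer raises the `ValueError`.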