
[Bug Fix] fix qa pipeline tensor to numpy #31585

Merged — 2 commits merged into huggingface:main on Jul 11, 2024
Conversation

jiqing-feng
Contributor

@jiqing-feng jiqing-feng commented Jun 25, 2024

Hi @Narsil @amyeroberts

This PR fixes an error in the question-answering pipeline. The error can be reproduced with:

from transformers import pipeline
pipe = pipeline("question-answering", model="hf-internal-testing/tiny-random-bert")
question = "What's my name?"
context = "My Name is Sasha and I live in Lyon."
pipe(question, context)

Traceback:

Traceback (most recent call last):
  File "test_qa.py", line 5, in <module>
    pipe(question, context)
  File "/home/jiqingfe/miniconda3/envs/ccl/lib/python3.8/site-packages/transformers/pipelines/question_answering.py", line 393, in __call__
    return super().__call__(examples[0], **kwargs)
  File "/home/jiqingfe/miniconda3/envs/ccl/lib/python3.8/site-packages/transformers/pipelines/base.py", line 1235, in __call__
    return next(
  File "/home/jiqingfe/miniconda3/envs/ccl/lib/python3.8/site-packages/transformers/pipelines/pt_utils.py", line 125, in __next__
    processed = self.infer(item, **self.params)
  File "/home/jiqingfe/miniconda3/envs/ccl/lib/python3.8/site-packages/transformers/pipelines/question_answering.py", line 546, in postprocess
    starts, ends, scores, min_null_score = select_starts_ends(
  File "/home/jiqingfe/miniconda3/envs/ccl/lib/python3.8/site-packages/transformers/pipelines/question_answering.py", line 124, in select_starts_ends
    undesired_tokens = undesired_tokens & attention_mask
TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
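The root cause is a dtype mismatch: once `undesired_tokens` ends up as a float array, numpy's bitwise `&` has no safe cast against the integer attention mask. A minimal sketch (plain numpy, with hypothetical mask values) reproducing the same TypeError:

```python
import numpy as np

# If np.array(p_mask) produces a float array (as reported in this issue),
# a bitwise AND against an int attention mask fails.
undesired_tokens = np.array([0.0, 1.0, 1.0])  # float dtype
attention_mask = np.array([1, 1, 0])          # int dtype

try:
    undesired_tokens & attention_mask
except TypeError as e:
    print(type(e).__name__)  # TypeError
```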

@jiqing-feng jiqing-feng marked this pull request as ready for review June 25, 2024 06:26
@jiqing-feng
Contributor Author

I found this problem comes from numpy: with Python 3.8, np.array() casts the int tensor to float.

So I suggest we use p_mask.numpy() instead of np.array(p_mask).
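For a torch tensor, `.numpy()` returns a view that keeps the tensor's own dtype, so an integer `p_mask` stays integer. A small sketch (assumes torch is installed; the values are hypothetical):

```python
import torch

# .numpy() shares the tensor's memory and preserves its dtype (int64 here),
# so the later bitwise AND remains int & int.
p_mask = torch.tensor([0, 1, 1, 0], dtype=torch.int64)
arr = p_mask.numpy()
print(arr.dtype)  # int64
```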

@jiqing-feng jiqing-feng changed the title fix qa pipeline [Bug Fix] fix qa pipeline tensor to numpy Jun 25, 2024
@@ -118,7 +118,7 @@ def select_starts_ends(
         max_answer_len (`int`): Maximum size of the answer to extract from the model's output.
     """
     # Ensure padded tokens & question tokens cannot belong to the set of candidate answers.
-    undesired_tokens = np.abs(np.array(p_mask) - 1)
+    undesired_tokens = np.abs(p_mask.numpy() - 1)
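The changed line can be checked in isolation. Here a plain numpy array stands in for `p_mask.numpy()`, with hypothetical mask values:

```python
import numpy as np

# Stand-in for p_mask.numpy(); p_mask uses 1 for tokens that cannot
# belong to the answer (padding and question tokens).
p_mask_np = np.array([1, 0, 0, 1], dtype=np.int64)

# Invert the mask, as the patched line does.
undesired_tokens = np.abs(p_mask_np - 1)
print(undesired_tokens)  # [0 1 1 0]

# With integer dtypes preserved, the bitwise AND with the attention
# mask no longer raises.
attention_mask = np.array([1, 1, 1, 0], dtype=np.int64)
print(undesired_tokens & attention_mask)  # [0 1 1 0]
```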
Collaborator

@amyeroberts amyeroberts Jun 25, 2024

Does this still work if you run the pipeline in jax?

from transformers import pipeline
pipe = pipeline("question-answering", model="hf-internal-testing/tiny-random-bert", framework="flax")
question = "What's my name?"
context = "My Name is Sasha and I live in Lyon."

Contributor Author

It will raise a value error.
ValueError: Pipeline cannot infer suitable model classes from hf-internal-testing/tiny-random-bert.

Contributor Author

@jiqing-feng jiqing-feng Jun 26, 2024

Besides, tensor.numpy() is already used in other pipelines like ASR.

Collaborator

OK, yes, looking into it we seem to assume either tf or pt everywhere in the pipeline, so even though I think this would break things for jax tensors, it's not something we need to account for at the moment. Thanks for testing!

@jiqing-feng
Contributor Author

Hi @amyeroberts, could you take a look at this PR? I am waiting for your response, thx!

@LysandreJik
Member

Hey @jiqing-feng! I'm trying to reproduce the issue but failing at doing so with python 3.8.18 and numpy 1.24.4.

>>> import torch
>>> import numpy as np
>>> a = torch.tensor([1,2,3], dtype=torch.int64)
>>> a
tensor([1, 2, 3])
>>> np.array(a)
array([1, 2, 3])
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=8, micro=18, releaselevel='final', serial=0)

What's your torch version?

@jiqing-feng
Copy link
Contributor Author

> What's your torch version?

torch 2.3.0+cpu

@jiqing-feng
Contributor Author

I just checked that torch 2.3.1+cpu fixes this issue; you can close this PR if you think the change is unnecessary. BTW, I believe the change won't break anything, and tensor.numpy() is the more common pattern. Thx!

@amyeroberts
Collaborator

@jiqing-feng Thanks for investigating across the different pytorch versions. If the fix is only in later versions, then this is a change we'd still want, as we officially support torch >= 1.11.

Collaborator

@amyeroberts amyeroberts left a comment

Thanks for fixing!

@amyeroberts amyeroberts merged commit aec1ca3 into huggingface:main Jul 11, 2024
18 checks passed
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jul 19, 2024
MHRDYN7 pushed a commit to MHRDYN7/transformers that referenced this pull request Jul 23, 2024
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request Jul 24, 2024