Evaluation of form feed symbol with BLEU results in error #601

Open
lowlypalace opened this issue Jun 11, 2024 · 0 comments

Hi, I'm generating LLM sequences with some of the HF models, such as pythia-1.4b. Some of my generations result in a sequence consisting only of the form feed character, which is the 12th ASCII character.

from evaluate import load

bleu = load("bleu")

prediction = "hello"
reference = chr(12)

bleu_score = bleu.compute(
    predictions=[prediction], references=[[reference]]
)["bleu"]

The code above results in the following error:

ZeroDivisionError                         Traceback (most recent call last)
[<ipython-input-1-8625f8bf1df7>](https://localhost:8080/#) in <cell line: 8>()
      6 reference = chr(12)
      7 
----> 8 bleu_score = bleu.compute(
      9     predictions=[prediction], references=[[reference]]
     10 )["bleu"]

2 frames
[/usr/local/lib/python3.10/dist-packages/evaluate/module.py](https://localhost:8080/#) in compute(self, predictions, references, **kwargs)
    465             inputs = {input_name: self.data[input_name] for input_name in self._feature_names()}
    466             with temp_seed(self.seed):
--> 467                 output = self._compute(**inputs, **compute_kwargs)
    468 
    469             if self.buf_writer is not None:

[~/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--bleu/9e0985c1200e367cce45605ce0ecb5ede079894e0f24f54613fca08eeb8aff76/bleu.py](https://localhost:8080/#) in _compute(self, predictions, references, tokenizer, max_order, smooth)
    120         references = [[tokenizer(r) for r in ref] for ref in references]
    121         predictions = [tokenizer(p) for p in predictions]
--> 122         score = compute_bleu(
    123             reference_corpus=references, translation_corpus=predictions, max_order=max_order, smooth=smooth
    124         )

[~/.cache/huggingface/modules/evaluate_modules/metrics/evaluate-metric--bleu/9e0985c1200e367cce45605ce0ecb5ede079894e0f24f54613fca08eeb8aff76/nmt_bleu.py](https://localhost:8080/#) in compute_bleu(reference_corpus, translation_corpus, max_order, smooth)
    101     geo_mean = 0
    102 
--> 103   ratio = float(translation_length) / reference_length
    104 
    105   if ratio > 1.0:

ZeroDivisionError: float division by zero
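For context, my suspicion is that `reference_length` ends up 0 because the reference tokenizes to an empty token list: the form feed is a whitespace character, so (assuming the default tokenizer ultimately splits on whitespace, as tokenizer_13a effectively does) no tokens survive. A minimal sketch of that suspicion:

```python
# Minimal reproduction of the suspected root cause (assumes the default
# tokenizer effectively splits on whitespace).
reference = chr(12)          # form feed, ASCII 12

# Form feed counts as whitespace in Python, so splitting yields no tokens.
tokens = reference.split()
print(reference.isspace())   # → True
print(tokens)                # → []

# With zero reference tokens, reference_length in compute_bleu is 0, and
# ratio = float(translation_length) / reference_length divides by zero.
```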

The expected behaviour would be that a score is still computed for this character, even though it is non-printable. I believe the same error will occur with other non-printable characters. Is this intended behaviour?
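Until this is handled upstream, a possible workaround (purely a sketch, under the assumption that references containing only whitespace/control characters are the trigger, and using a hypothetical helper name) is to drop prediction/reference pairs whose references would tokenize to nothing before calling `bleu.compute`:

```python
# Hypothetical helper: filter out pairs whose references all tokenize to
# nothing, so compute_bleu never sees a zero-length reference corpus.
# Uses plain whitespace splitting as an approximation of the tokenizer.
def safe_bleu_inputs(predictions, references):
    kept_preds, kept_refs = [], []
    for pred, refs in zip(predictions, references):
        # Keep the pair only if at least one reference yields tokens.
        if any(r.split() for r in refs):
            kept_preds.append(pred)
            kept_refs.append(refs)
    return kept_preds, kept_refs

# A form-feed-only reference is dropped entirely:
print(safe_bleu_inputs(["hello"], [[chr(12)]]))      # → ([], [])
# A normal reference passes through unchanged:
print(safe_bleu_inputs(["hi"], [["hi there"]]))      # → (['hi'], [['hi there']])
```

If the filtered lists are non-empty, they can then be passed to `bleu.compute` as before without hitting the division by zero.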
