Toxicity Measurement #262
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for adding this! The PR is in pretty good shape already! Mostly added some comments about efficiently loading the pipeline.
Just looking at the functionality, this also seems to me like a case where it is not so clear why this shouldn't be a measurement: essentially you look at text, and it doesn't matter much whether it is generated or human-written.
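For context, here is a minimal sketch of what "loading the pipeline efficiently" could look like: the classifier is instantiated once in `_download_and_prepare` and reused by every `compute` call, instead of being reloaded each time. The model name, label name, and class layout are assumptions for illustration, not the final PR code.

```python
import datasets
import evaluate
from transformers import pipeline


class Toxicity(evaluate.Measurement):
    def _info(self):
        return evaluate.MeasurementInfo(
            description="Scores how toxic each input sentence is.",
            citation="",
            inputs_description="",
            features=datasets.Features({"predictions": datasets.Value("string")}),
        )

    def _download_and_prepare(self, dl_manager):
        # Load the classifier a single time, when the measurement itself is loaded.
        self.toxic_classifier = pipeline(
            "text-classification",
            model="facebook/roberta-hate-speech-dynabench-r4-target",  # assumed model
            top_k=None,  # return scores for every label, not just the top one
            truncation=True,
        )

    def _compute(self, predictions, toxic_label="hate"):
        # Keep only the probability assigned to the requested toxic label.
        scores = []
        for all_scores in self.toxic_classifier(list(predictions)):
            scores.append(
                next(s["score"] for s in all_scores if s["label"] == toxic_label)
            )
        return {"toxicity": scores}
```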
measurements/toxicity/toxicity.py (Outdated)

    Args:
        `predictions` (list of str): prediction/candidate sentences
        `toxic_label` (optional): the toxic label that you want to detect, depending on the labels that the model has been trained on.
Add that the type is str. Should we specify that right now we only allow for one label here? Toxicity is often a multi-class prediction problem, with toxicity along several axes (e.g. identity-based hate vs. racism), but right now we only handle one class.
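A small sketch of what the suggested docstring line could look like with the type added; the exact wording here is hypothetical.

```python
"""
Args:
    `predictions` (list of str): prediction/candidate sentences
    `toxic_label` (str, optional): the toxic label to detect; must be one of the labels the model was trained on.
"""
```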
And how would you aggregate the results across different labels?
E.g. if you have {'offensive': 0.65, 'hate': 0.98}, then what?
Ah yes, aggregation would be a bit tricky. I think the Perspective API (as an example) reports these results back unaggregated: there's an individual score for each category ('identity hate', 'toxicity', 'sexism', 'racism', 'sexually explicit', etc.) and no aggregation across categories.
I assume that as an end user of a toxicity API you'd want to handle sexually explicit content differently from identity-based hate, so the granularity is helpful/necessary. An equivalent approach here would be to skip aggregation when several types of toxicity are specified and report back per toxicity class (e.g. `toxic_labels` is a list instead of a str). What do you think?
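To make the idea concrete, here is a hypothetical sketch (not code from this PR) of unaggregated per-class reporting if `toxic_labels` were a list; the classifier output format is assumed.

```python
def per_label_scores(classifier_outputs, toxic_labels=("offensive", "hate")):
    """classifier_outputs: one list of {"label", "score"} dicts per input text."""
    results = {label: [] for label in toxic_labels}
    for all_scores in classifier_outputs:
        by_label = {s["label"]: s["score"] for s in all_scores}
        for label in toxic_labels:
            # Report each requested toxicity class separately, no aggregation.
            results[label].append(by_label.get(label, 0.0))
    return results


# Example: two input texts scored by a multi-label classifier.
outputs = [
    [{"label": "offensive", "score": 0.65}, {"label": "hate", "score": 0.98}],
    [{"label": "offensive", "score": 0.10}, {"label": "hate", "score": 0.02}],
]
print(per_label_scores(outputs))
# {'offensive': [0.65, 0.1], 'hate': [0.98, 0.02]}
```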
I can only find binary hate speech classification models on the Hub at the moment, so maybe we keep it like this for now?
updating examples
Looks good, left mostly nits! I think there is an issue with the docstring, based on the CI error. Also, if you merge main into your branch, the timeout issue in the CI should not be there anymore.
From the CI: Toxicity has inconsistent leading whitespace: ' `aggregation` (optional): determines the type of aggregation performed on the data. If set to `None`, the scores for each prediction are returned.'
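For reference, a rough usage sketch of how the `aggregation` argument from that docstring could behave; the option names ("maximum", "ratio") and return keys are assumptions based on this thread rather than a merged API.

```python
import evaluate

# Hypothetical usage of the measurement under discussion.
toxicity = evaluate.load("toxicity", module_type="measurement")
texts = ["this is a perfectly nice sentence", "you are a horrible person"]

# aggregation=None (the default): one toxicity score per prediction.
print(toxicity.compute(predictions=texts))

# Possible aggregated variants mentioned in the docstring discussion:
print(toxicity.compute(predictions=texts, aggregation="maximum"))  # highest single score
print(toxicity.compute(predictions=texts, aggregation="ratio"))    # share of predictions above a threshold
```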
Thanks for adding this - added a few final remarks and suggestions. Then it's good to go :)
    codebase_urls=[],
    reference_urls=[],
No references or code on GitHub we can reference here?
Not really, there is just the dataset that the toxicity model was trained on: https://github.com/bvidgen/Dynamically-Generated-Hate-Speech-Dataset
Not sure if that's helpful.
Is there a reason we wouldn't want to link to the RealToxicityPrompts repo (https://github.com/allenai/real-toxicity-prompts)? The classifier is different (Perspective vs. the FAIR classifier), but it's the same idea, and RealToxicityPrompts has been a canonical citation for toxicity measurement over the past few years.
I'm not sure it's super useful, since it's a general toxicity measure used by lots of other repos, not only RealToxicityPrompts (which also uses a completely different model + approach).
Initial draft of the toxicity metric -- would love your thoughts, @mathemakitten and @lvwerra !