
Towards Identifying Social Bias in Dialogue Systems

This repository contains a detailed description of the CDial-Bias Dataset.

This task aims to measure social bias in dialogue scenarios. Because biased utterances can be subtle in expression and subjective in interpretation, measuring social bias requires rigorous analysis and normative reasoning. Competitors are therefore provided with a well-annotated training dataset with detailed analyses, including context sensitivity, data type, targeted group, and implied attitude. At the test stage, the task provides a more realistic scenario in which only the dialogues are given, and competitors must predict a fine-grained category (i.e., irrelevant, anti-bias, neutral, or biased) with respect to dialogue social bias.

Authors:

Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen M. Meng

Paper:

https://aclanthology.org/2022.findings-emnlp.262/

Dataset:

⚠️ Before downloading the dataset, please be aware that the CDial-Bias Dataset is released for research purposes only; any other usage requires further permission. Please ensure your usage contributes to improving the safety and fairness of AI technologies. No malicious usage is allowed.

🤗 Dataset on Hugging Face
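
For a quick look at the data, it can usually be loaded with the Hugging Face `datasets` library. The sketch below is illustrative only: the dataset ID (guessed from this repository's name) and the split name are assumptions, so verify both on the Hugging Face page above.

```python
# Illustrative sketch: load CDial-Bias via the Hugging Face `datasets` library.
# The dataset ID and split name below are assumptions -- confirm both on the
# Hugging Face page before relying on them.
from datasets import load_dataset

dataset = load_dataset("para-zhou/CDial-Bias")  # assumed dataset ID
print(dataset)              # inspect the available splits and features
print(dataset["train"][0])  # peek at one example (assumed split name)
```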

Website:

🥇 Please check the webpage for details on NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement, which provides detailed information and the leaderboard for this dataset.

Dataset

Format

The CDial-Bias Dataset 2.0 contains the following fields.

| Field | Explanation |
| --- | --- |
| Q | Dialogue turn 1. |
| A | Dialogue turn 2. |
| Topic | The topic of the dialogue: Race, Gender, Region, or Occupation. |
| Context Sensitivity | 0 - Context-independent; 1 - Context-sensitive. |
| Data Type | 0 - Irrelevant; 1 - Bias-expressing; 2 - Bias-discussing. |
| Bias Attitudes | 0 - NA (irrelevant data); 1 - Anti-Bias; 2 - Neutral; 3 - Biased. |
| Referenced Groups | Free text; multiple groups are separated by '/'. |
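
As a concrete illustration of these fields, the sketch below reads the released file with pandas and decodes the integer labels into readable names. The file name, delimiter, and exact column headers are assumptions about the release; adjust them to match the files you download.

```python
# Sketch: read CDial-Bias 2.0 and decode its label fields.
# File name, separator, and column names are assumptions about the release.
import pandas as pd

CONTEXT = {0: "Context-independent", 1: "Context-sensitive"}
DATA_TYPE = {0: "Irrelevant", 1: "Bias-expressing", 2: "Bias-discussing"}
ATTITUDE = {0: "NA", 1: "Anti-Bias", 2: "Neutral", 3: "Biased"}

df = pd.read_csv("cdial_bias_2.0.tsv", sep="\t")  # hypothetical file name

df["Context Sensitivity"] = df["Context Sensitivity"].map(CONTEXT)
df["Data Type"] = df["Data Type"].map(DATA_TYPE)
df["Bias Attitudes"] = df["Bias Attitudes"].map(ATTITUDE)

# Referenced groups are free text; multiple groups are separated by '/'.
df["Referenced Groups"] = df["Referenced Groups"].fillna("").str.split("/")

print(df.head())
```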

Statistics

| Topic | Context-Independent / Sensitive | Irrelevant | Bias-expressing | Bias-discussing | Anti | Neutral | Biased | Group # |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Race | 6,451 / 4,420 | 4,725 | 2,772 | 3,374 | 155 | 3,115 | 2,876 | 70 |
| Gender | 5,093 / 3,291 | 3,895 | 1,441 | 3,048 | 78 | 2,631 | 1,780 | 40 |
| Region | 2,985 / 2,046 | 1,723 | 2,217 | 1,091 | 197 | 1,525 | 1,586 | 41 |
| Occupation | 2,842 / 1,215 | 2,006 | 1,231 | 820 | 24 | 1,036 | 991 | 20 |
| Overall | 17,371 / 10,972 | 12,349 | 7,659 | 8,333 | 454 | 8,307 | 7,233 | - |

The dataset is randomly shuffled and split into training, validation, and test sets in an 8:1:1 ratio.
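
If the official split files are not at hand, a shuffled 8:1:1 partition of the same shape can be reproduced roughly as follows. This is a sketch under the same file-name assumption as above; the officially released splits, where provided, should take precedence.

```python
# Sketch: a random 8:1:1 train/validation/test split mirroring the ratio above.
# Prefer the officially released split files when they are available.
import pandas as pd

df = pd.read_csv("cdial_bias_2.0.tsv", sep="\t")  # hypothetical file name
df = df.sample(frac=1.0, random_state=42).reset_index(drop=True)  # shuffle

n = len(df)
train = df.iloc[: int(0.8 * n)]
valid = df.iloc[int(0.8 * n) : int(0.9 * n)]
test = df.iloc[int(0.9 * n) :]

print(len(train), len(valid), len(test))
```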

Notes

If you want to publish experimental results with this dataset, please cite the following article:

@inproceedings{zhou-etal-2022-towards-identifying,
    title = "Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark",
    author = "Zhou, Jingyan  and Deng, Jiawen  and Mi, Fei  and Li, Yitong  and Wang, Yasheng  and Huang, Minlie  and Jiang, Xin  and  Liu, Qun  and Meng, Helen",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2022",
    month = dec,
    year = "2022",
    address = "Abu Dhabi, United Arab Emirates",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.findings-emnlp.262",
    doi = "10.18653/v1/2022.findings-emnlp.262"
}

We also held NLPCC 2022 Shared Task 7 based on the proposed resources. Many talented participants contributed to the investigation of this problem; for more information, please check the webpage and the task overview:

@inproceedings{zhou2022overview,
  title={Overview of NLPCC 2022 Shared Task 7: Fine-Grained Dialogue Social Bias Measurement},
  author={Zhou, Jingyan and Mi, Fei and Meng, Helen and Deng, Jiawen},
  booktitle={CCF International Conference on Natural Language Processing and Chinese Computing},
  pages={342--350},
  year={2022},
  organization={Springer}
}
