Dialog State Tracking Challenge 2 & 3 Data

This data used to be hosted on http://camdial.org/~mh521/dstc/

Data downloads

View all under Releases.

Original challenge results:

Overview

The Dialog State Tracking Challenges 2 & 3 (DSTC2&3) were research challenges focused on improving the state of the art in tracking the state of spoken dialog systems. State tracking, sometimes called belief tracking, refers to accurately estimating the user's goal as a dialog progresses. Accurate state tracking is desirable because it provides robustness to errors in speech recognition, and helps reduce the ambiguity inherent in language within a temporal process like dialog.
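To make the idea of belief tracking concrete, here is a minimal sketch in Python of a rule-based tracker that keeps a probability distribution over values for each slot and updates it every turn from scored spoken language understanding (SLU) hypotheses. The function name `update_belief`, the `(slot, value, score)` tuple layout, and the example scores are all assumptions made for illustration; they are not the challenge data format or any participant's tracker.

```python
from collections import defaultdict


def update_belief(belief, slu_hyps):
    """Return an updated belief given one turn of SLU hypotheses.

    belief:   dict mapping slot -> {value: probability}   (hypothetical layout)
    slu_hyps: list of (slot, value, score) tuples with scores in [0, 1]
    """
    # Total SLU confidence observed for each slot in this turn.
    totals = defaultdict(float)
    for slot, _value, score in slu_hyps:
        totals[slot] += score

    # Copy the old belief so the caller's dictionaries are not mutated.
    new_belief = {slot: dict(dist) for slot, dist in belief.items()}

    for slot, total in totals.items():
        # Rescale if the scores for a slot sum to more than 1.
        norm = max(total, 1.0)
        q = total / norm
        # Discount the old belief by the new evidence mass, so that
        # recent evidence can override an earlier user goal.
        dist = {v: p * (1.0 - q) for v, p in new_belief.get(slot, {}).items()}
        for hyp_slot, value, score in slu_hyps:
            if hyp_slot == slot:
                dist[value] = dist.get(value, 0.0) + score / norm
        new_belief[slot] = dist

    return new_belief


# Two illustrative turns: the recogniser first favours "italian",
# then the user (or a better recognition) points to "indian".
belief = {}
belief = update_belief(belief, [("food", "italian", 0.6), ("food", "indian", 0.2)])
belief = update_belief(belief, [("food", "indian", 0.7)])
top_food = max(belief["food"], key=belief["food"].get)
```

After the second turn the belief for `food` shifts towards `indian` while retaining some mass on `italian`, which is the kind of robustness to recognition errors and changing goals that state tracking aims for. Any probability mass not assigned to a value can be read as "no value specified yet".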

In these challenges, participants were given labelled corpora of dialogs to develop state tracking algorithms. The trackers were then evaluated on a common set of held-out dialogs, which were released, unlabelled, during a one-week period.

The corpus was collected using Amazon Mechanical Turk, and consists of dialogs in two domains: restaurant information, and tourist information. Tourist information subsumes restaurant information, and includes bars, cafés etc. as well as multiple new slots. There were two rounds of evaluation using this data:

  • DSTC 2 released a large number of training dialogs related to restaurant search. Compared to the first DSTC (which was in the bus timetables domain), DSTC 2 introduced changing user goals and the tracking of 'requested slots', as well as the new restaurant domain. Results from DSTC 2 were presented at SIGDIAL 2014.
  • DSTC 3 addressed the problem of adaptation to a new domain -- tourist information. DSTC 3 releases a small amount of labelled data in the tourist information domain; participants will use this data plus the restaurant data from DSTC 2 for training.

Dialogs used for training are fully labelled; user transcriptions, user dialog-act semantics and dialog state are all annotated. (This corpus therefore is also suitable for studies in Spoken Language Understanding.)

For more detailed information, please see the handbook.

Citations

@inproceedings{henderson2014second,
  title={The second dialog state tracking challenge},
  author={Henderson, Matthew and Thomson, Blaise and Williams, Jason D},
  booktitle={Proceedings of the 15th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL)},
  pages={263--272},
  year={2014}
}

@inproceedings{henderson2014third,
  title={The third dialog state tracking challenge},
  author={Henderson, Matthew and Thomson, Blaise and Williams, Jason D},
  booktitle={2014 IEEE Spoken Language Technology Workshop (SLT)},
  pages={324--329},
  year={2014},
  organization={IEEE}
}

@article{williams2014dialog,
  title={The dialog state tracking challenge series},
  author={Williams, Jason D and Henderson, Matthew and Raux, Antoine and Thomson, Blaise and Black, Alan and Ramachandran, Deepak},
  journal={AI Magazine},
  volume={35},
  number={4},
  pages={121--124},
  year={2014}
}

@article{williams2016dialog,
  title={The dialog state tracking challenge series: A review},
  author={Williams, Jason and Raux, Antoine and Henderson, Matthew},
  journal={Dialogue \& Discourse},
  volume={7},
  number={3},
  pages={4--33},
  year={2016}
}