Google Research Datasets

sanpo_dataset Public

Python 39 Apache-2.0 1 3 1 Updated Aug 2, 2024
SPICE Public
SPICE is a stereotype dataset in English containing stereotypes collected in India with community engagement. It spans identity groups and stereotypes unique to India, as well as other stereotypes about gender and nationalities.

2 CC-BY-4.0 0 0 0 Updated Jul 26, 2024
cube Public
CUBE is a benchmark for evaluating the cultural competence of text-to-image (T2I) models.

4 CC-BY-4.0 0 0 0 Updated Jul 18, 2024
screen_qa Public
The ScreenQA dataset was introduced in the paper "ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots". It contains ~86K question-answer pairs collected by human annotators for ~35K screenshots from Rico. It is intended for training and evaluating models capable of screen content understanding via question answering.

78 CC-BY-4.0 8 1 0 Updated Jul 18, 2024
uicrit Public
UICrit is a dataset containing human-generated natural language design critiques, corresponding bounding boxes for each critique, and design quality ratings for 1,000 mobile UIs from RICO. This dataset was collected for our UIST '24 paper: https://arxiv.org/abs/2407.08850.

1 0 0 0 Updated Jul 18, 2024
visage Public
Visage is a dataset of images with human annotations indicating whether certain attributes are present or depicted in each image. An attribute may be either stereotypical or non-stereotypical with respect to the identity group in the image. The dataset also contains a list of attributes in English, along with annotations about whether they are visual.

6 Apache-2.0 1 0 0 Updated Jul 16, 2024
dices-dataset Public
This repository contains two datasets of multi-turn adversarial conversations, generated by human agents interacting with a dialog model and rated for safety by two corresponding, diverse rater pools.

23 2 1 0 Updated Jul 16, 2024
wit Public
The WIT (Wikipedia-based Image Text) dataset is a large multimodal, multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages.

982 40 0 0 Updated Jul 12, 2024
rico_semantics Public
This dataset consists of ~500k human annotations on the RICO dataset that identify icons by their shapes and semantics and associate selected general UI elements with their text labels. The annotations also include human-annotated bounding boxes, which are more accurate and cover more UI elements.

19 CC-BY-SA-4.0 2 1 0 Updated Jun 27, 2024
tpu_graphs Public

C++ 119 Apache-2.0 43 2 1 Updated Jun 25, 2024
