Commit

Reference LABELING.md in README.md and move all labeling information in LABELING.md
sagarvijaygupta committed Jun 23, 2018
1 parent 8121c9d commit 51b5856
Showing 2 changed files with 7 additions and 32 deletions.
7 changes: 7 additions & 0 deletions LABELING.md
@@ -1,8 +1,11 @@

# Labeling Guidelines

Now that the screenshots are available, they need to be labeled. The labeling phase operates on couples of comparable screenshots.

## Images marked as compatible - y
---
#### Couples of images that are clearly compatible.
#### They look the same.
#### firefox\_chrome\_overlay window should nearly overlap them.
---
@@ -14,13 +17,15 @@

## Bounding boxes marked as incompatible - n
---
#### Couples of images which are not compatible
#### They are different.
#### Mark the parts which are logically different.
> Improper loading of images, missing text, different design, different languages are marked incompatible.
---

## Bounding boxes marked as different yet compatible - d
---
#### Couples of images that are compatible, but with content differences.
#### They look different.
#### Mark the parts which are logically the same.
>Different advertisements, different videos loaded, time-in-clock are marked
@@ -42,3 +47,5 @@ as different yet compatible.
<p align="center"><img src="labeling_guide/n14.png" width=617 height=357></p>
<p align="center"><img src="labeling_guide/d1.png" width=617 height=357></p>


In the training phase, the best case is that we are able to detect between **Y + D and N**. If we are not able to do that, we should at least aim for the relaxed problem of detecting between **Y and D + N**. This is why we have this three labeling system.
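
The two training targets described above can be sketched as a mapping from the three labels to a binary class. This is a minimal illustration, not code from the repository: the helper name `to_binary_target` and its `relaxed` flag are hypothetical, and the choice of `1` for the "problem" class is an assumption.

```python
# Hypothetical helper for illustration only; it is not part of the project.
LABELS = ("y", "d", "n")


def to_binary_target(label, relaxed=False):
    """Collapse a three-way label (y, d, n) into a binary training target.

    relaxed=False: best case, Y + D (class 0) versus N (class 1).
    relaxed=True:  relaxed problem, Y (class 0) versus D + N (class 1).
    """
    if label not in LABELS:
        raise ValueError(f"unknown label: {label!r}")
    if relaxed:
        # Only clearly compatible pairs count as class 0.
        return 0 if label == "y" else 1
    # Content differences (d) are still treated as compatible.
    return 1 if label == "n" else 0


labels = ["y", "d", "n", "d"]
strict_targets = [to_binary_target(label) for label in labels]
relaxed_targets = [to_binary_target(label, relaxed=True) for label in labels]
```

Keeping three labels at annotation time means either binary target can be derived later without relabeling.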
32 changes: 0 additions & 32 deletions README.md
@@ -17,38 +17,6 @@ The `data/` directory contains the screenshots generated by the crawler (N.B.: T

### Labeling
[labeling guide](LABELING.md)
Now that the screenshots are available, they need to be labeled. The labeling phase operates on couples of comparable screenshots.

There are three possible labels:
1. **Y** for couples of images that are clearly compatible;
2. **D** for couples of images that are compatible, but with content differences (e.g. on a news site, two screenshots could be compatible even though they are showing two different news, simply because the news shown depends on the time the screenshot was taken and not on the fact that the browser is different);
3. **N** for couples of images which are not compatible.

Here are some examples of the three labels:

**Y**
<img src="https://user-images.githubusercontent.com/1616846/35619755-4a932132-067f-11e8-8b1c-c2f70a6819f4.png" width=158 /> <img src="https://user-images.githubusercontent.com/1616846/35619749-458ac7b2-067f-11e8-868d-ac6e186dec98.png" width=158 />

**D**
<img src="https://user-images.githubusercontent.com/1616846/35619779-5d39f90a-067f-11e8-9e31-7c793c79f246.png" width=158 /> <img src="https://user-images.githubusercontent.com/1616846/35619800-6f25ff2e-067f-11e8-8792-f1c3d9c875d1.png" width=158 />

**N**
<img src="https://user-images.githubusercontent.com/1616846/35619822-7f65ed22-067f-11e8-9b2b-ea99cfd6f7de.png" width=158 /> <img src="https://user-images.githubusercontent.com/1616846/35619769-5724cafe-067f-11e8-8e6a-00d527ab3581.png" width=158 />

In the training phase, the best case is that we are able to detect between Y+D and N. If we are not able to do that, we should at least aim for the relaxed problem of detecting between Y and D+N. This is why we have this three labeling system.

The labeling technical details are described [in this issue](https://github.com/marco-c/autowebcompat/issues/2).

The bounding-box labeling allows us to store the areas where the incompatibilities lie.

<img src="https://user-images.githubusercontent.com/18056781/39081659-fdd4655e-4562-11e8-86f9-a5fab28634bf.JPG" />

<img src="https://user-images.githubusercontent.com/18056781/39081665-10eda006-4563-11e8-9455-986b5a23934e.jpg" />

- Press 'y' to mark the images as compatible;
- Press 'Enter' to select the regions;
- Click the 'T' button in the top left corner of a boundary box to toggle between classes. Green corresponds to 'n', yellow corresponds to 'd';
- Press 'Enter' to save changes.

### Training
