Addressing label disagreement workflow #275

Open
ivanzvonkov opened this issue Feb 14, 2023 · 4 comments

@ivanzvonkov
Collaborator

Context:
The first step of creating a cropland map involves generating a validation and test set.
To do this, two Collect Earth Online (CEO) sets are created following this notebook.
The two sets contain identical data points and are labeled by different labelers.
When labeling is complete, a single dataset is created by combining data points from both sets. The label (crop/non-crop) is represented as a float (1.0/0.0).
i) If both labelers label the data point as crop (1.0), the final label is crop (1.0).
ii) If both labelers label the data point as non-crop (0.0), the final label is non-crop (0.0).
iii) If one labeler labels the data point as crop (1.0) and the other as non-crop (0.0), the final label is the average: 0.5.
An example of this processing is shown here:

df = df.groupby([LON, LAT], as_index=False, sort=False).agg(
    {CLASS_PROB: "mean"}  # assumption: averaging the two 1.0/0.0 labels, which yields 0.5 on disagreement
)

Data points labeled 0.5 are disagreement points, where the labelers disagreed; the overall disagreement rate is reported in data/report.txt, e.g.

disagreement: 12.1%
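
For reference, the reported percentage is simply the share of combined points whose label ended up at 0.5; a minimal sketch, assuming the combined dataframe and CLASS_PROB column from above:

# Fraction of combined points carrying the 0.5 disagreement label
disagreement_rate = (df[CLASS_PROB] == 0.5).mean()
print(f"disagreement: {disagreement_rate:.1%}")  # e.g. "disagreement: 12.1%"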

Because the disagreement points are neither crop nor non-crop, they are ignored by the model when loading data. See

# Disagreement points (0.5) are excluded when assembling each subset
dfs.append(df[(df[SUBSET] == subset) & (df[CLASS_PROB] != 0.5)])

Problem:
Disagreement points represent points that are difficult to label, and they should not simply be ignored.

Potential Solution:
A workflow should be implemented to flag datasets with high disagreement and to address those disagreements. This could be done by returning to the original Collect Earth Online labeling projects and engaging experts to resolve the disagreements. Documenting and automating this workflow is important.
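
A minimal sketch of the flagging half of such a workflow; the CLASS_PROB column name and the 10% threshold are assumptions, not anything fixed in the codebase:

import pandas as pd

CLASS_PROB = "class_probability"  # assumption: the label column used above
DISAGREEMENT_THRESHOLD = 0.10     # assumption: what counts as "high" is project-specific

def flag_high_disagreement(df: pd.DataFrame, name: str) -> bool:
    """Return True (and report) if a labeled dataset exceeds the disagreement threshold."""
    rate = (df[CLASS_PROB] == 0.5).mean()
    if rate > DISAGREEMENT_THRESHOLD:
        print(f"{name}: {rate:.1%} disagreement -> return to CEO project for expert review")
        return True
    return False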

@bhyeh
Contributor

bhyeh commented Feb 23, 2023

@ivanzvonkov - looking at patterns in the disagreements may give us clues about how to resolve them more systematically, but what kinds of patterns might be useful? Brainstorming some thoughts here...

  1. Certain labelers involved in disagreements
    a) Labelers who are now known to have been labeling incorrectly $\rightarrow$ ignore their label in favor of the opposing label
  2. Analysis duration
    a) Short analysis times may indicate incorrect labels, whereas longer durations may indicate more ambiguous points (see the sketch after this list)
    b) The above can be investigated further with area estimation labeling projects (where a 'final' consensus label set is available) by checking whether, for disagreement points, the labels with shorter analysis durations were more often incorrect
  3. Label distributions
    a) ?
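
A sketch of the duration comparison in 2, assuming hypothetical CEO export column names (analysis_duration, crop_probability, lon, lat); the real headers may differ:

import pandas as pd

DURATION = "analysis_duration"   # hypothetical: seconds a labeler spent on a plot
LABEL = "crop_probability"       # hypothetical: 1.0 crop / 0.0 non-crop
LON, LAT = "lon", "lat"

def duration_by_agreement(set1: pd.DataFrame, set2: pd.DataFrame) -> pd.DataFrame:
    """Mean analysis duration on agreement vs. disagreement points, per labeler set."""
    merged = set1.merge(set2, on=[LON, LAT], suffixes=("_1", "_2"))
    merged["disagree"] = merged[f"{LABEL}_1"] != merged[f"{LABEL}_2"]
    return merged.groupby("disagree")[[f"{DURATION}_1", f"{DURATION}_2"]].mean()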

@hannah-rae
Contributor

Re: 1a, this could also identify individuals who might need additional training/advice from an expert.

Re: 3, it could be interesting to know whether there is more disagreement among crop points or non-crop points, but we can't know that unless we have a tie-breaker/correction. So that may be something to investigate with the crop area notebooks, where we do have the tie broken by group/expert review (a sketch follows below).
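
A sketch of that check, building on a merged dataframe like the one above plus a hypothetical expert-adjudicated final_label column:

import pandas as pd

def disagreement_by_class(merged: pd.DataFrame) -> pd.Series:
    """Disagreement rate among expert-confirmed crop vs. non-crop points.

    Assumes hypothetical columns: crop_probability_1 / crop_probability_2
    (the two labelers) and final_label (the expert tie-break: 1.0 or 0.0).
    """
    disagree = merged["crop_probability_1"] != merged["crop_probability_2"]
    return disagree.groupby(merged["final_label"]).mean()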

@ivanzvonkov
Collaborator Author

Re: 1a) this is useful. I envision the implementation as a separate notebook that reads in all the CEO files we have and runs the analysis over all sets at once, to surface legitimate trends.
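
A sketch of what that notebook's core loop might look like; the directory layout (paired exports named <project>-set1.csv / <project>-set2.csv) and the email / crop_probability / lon / lat columns are all assumptions:

from pathlib import Path
import pandas as pd

def labeler_disagreement_rates(ceo_dir: str) -> pd.Series:
    """Across all CEO projects, each labeler's share of points involved in disagreements."""
    rows = []
    for set1_path in Path(ceo_dir).glob("*-set1.csv"):
        set2_path = set1_path.with_name(set1_path.name.replace("-set1", "-set2"))
        s1, s2 = pd.read_csv(set1_path), pd.read_csv(set2_path)
        merged = s1.merge(s2, on=["lon", "lat"], suffixes=("_1", "_2"))
        disagree = merged["crop_probability_1"] != merged["crop_probability_2"]
        for side in ("_1", "_2"):  # attribute each disagreement to both labelers involved
            rows.append(pd.DataFrame({"email": merged["email" + side], "disagree": disagree}))
    return pd.concat(rows).groupby("email")["disagree"].mean().sort_values(ascending=False)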

Re: 2, I think this direction makes sense, and your initial work in the PR is a good first step.

@hannah-rae
Contributor

hannah-rae commented Feb 29, 2024

Flagging because Ben did lots of work on this, and I need to review where that ended up to see what we have left to do and whether this has been addressed, so we can propagate our findings to other projects. @ivanzvonkov maybe we can revisit this before the new labeler students start.

Also, this paper reminded me of this task because it addresses disagreement between labelers and how to resolve it: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2021EA002085
