ROC curves for multi-class (What-If Tool) #2755

grovina · 2019-10-10T14:02:49Z

Motivation for features / changes

Show ROC curves for multi-class models.

Technical description of changes

For each class, we can plot a ROC curve considering the class in question as the positive one and all the others as negatives (what is called a binarized version of the problem).
To achieve this, we iterate through all examples, populating an object with the following structure:
> model > slice > label > threshold > classification stats
Here, the classification stats contain the counts and rates of true and false positives and negatives. They are computed for every threshold value (in 1% steps from 0% to 100%), for every class label, for each feature slice (each value for categorical features, or each interval for numeric features), and for every model.
The overall case (not sliced by any feature) is trivially treated as a single slice.
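
For illustration only, here is a minimal sketch of the one-vs-rest counting described above, for a single model and a single slice. It is not the code in this PR: the function name `binarizedStats`, the example shape (`label`, `scores`), and the `score >= threshold` convention are all assumptions made for the sketch.

```javascript
// Hypothetical helper (not part of WIT): for each class label, treat that
// class as positive and every other class as negative, then count the
// confusion-matrix cells at each threshold.
// Assumed example shape: {label: 'a', scores: {a: 0.8, b: 0.2}}.
function binarizedStats(examples, classLabels, numThresholds = 101) {
  const stats = {};
  for (const label of classLabels) {
    stats[label] = [];
    for (let i = 0; i < numThresholds; i++) {
      // 101 thresholds gives 1% steps from 0 to 1 inclusive.
      const threshold = i / (numThresholds - 1);
      const counts = {threshold, TP: 0, FP: 0, TN: 0, FN: 0};
      for (const ex of examples) {
        // Assumption: a score at or above the threshold counts as positive.
        const predictedPositive = ex.scores[label] >= threshold;
        const actualPositive = ex.label === label;
        if (predictedPositive && actualPositive) counts.TP++;
        else if (predictedPositive) counts.FP++;
        else if (actualPositive) counts.FN++;
        else counts.TN++;
      }
      stats[label].push(counts);
    }
  }
  return stats;
}
```

Running this once per slice and per model would give the `model > slice > label > threshold` nesting; the rates follow directly from the four counts.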

When displaying the ROC curves, each class's binarized problem reduces to the ordinary ROC curve of a binary classification problem, drawn once for each slice.
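
As a rough sketch of that reduction, the points of one class's curve can be derived from its per-threshold counts exactly as in the binary case. The function name `rocPoints` and the `{TP, FP, TN, FN}` field names are assumptions for this example, not identifiers from the PR.

```javascript
// Hypothetical helper (not part of WIT): turn one class's per-threshold
// confusion counts into ROC points, as for a binary classifier.
function rocPoints(statsPerThreshold) {
  return statsPerThreshold.map(({TP, FP, TN, FN}) => ({
    // False positive rate: share of actual negatives predicted positive.
    fpr: FP + TN > 0 ? FP / (FP + TN) : 0,
    // True positive rate (recall): share of actual positives predicted positive.
    tpr: TP + FN > 0 ? TP / (TP + FN) : 0,
  }));
}
```

The zero-denominator guards are a choice made for the sketch; a slice with no positives or no negatives for a class has no meaningful rate, so a real implementation might prefer to skip the curve.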

Screenshots of UI changes

The overall case:

The sliced feature case:

Detailed steps to verify changes work correctly (as executed by you)
Open performance tab in the iris demo, and check that:
- the correct number of ROC curves appear (same number as number of classes in the problem)
- the values and shape of the ROC curve are coherent
- the curve respects the data

Repeat for sliced features, and also for different numbers of buckets

Alternate designs / implementations considered

We could think of averaged ROC curves (like https://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html). I personally think they are a bit harder to interpret.

Another possibility is merging all curves into a single plot. Although more compact, it wouldn't be as clear that each curve refers to a distinct version of the problem, and it could also get a bit too crowded with many classes or multiple models.

By considering one binarized problem for each class (the class vs. all the rest), we build one ROC curve for each class. This was done for the sliced case first so that the grouped case could be treated as a simple extension of it. The data is stored for each model in `inferenceStats_`, under the key `allThresholds`.
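
A small sketch of how the grouped case can fall out of the sliced one: when no slicing feature is chosen, every example lands in a single slice keyed by the empty string. The function `sliceExamples` and the `features` field are hypothetical names for illustration; the actual layout of `inferenceStats_` and `allThresholds` is not reproduced here.

```javascript
// Hypothetical helper (not part of WIT): bucket examples by the value of a
// categorical feature. With no feature name, all examples go into one slice
// under the empty-string key, so downstream code needs no special case.
function sliceExamples(examples, featureName) {
  const slices = {};
  for (const ex of examples) {
    const key = featureName ? String(ex.features[featureName]) : '';
    if (!slices[key]) slices[key] = [];
    slices[key].push(ex);
  }
  return slices;
}
```

Numeric features would need an extra bucketing step to map each value to its interval before it is used as a key.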


grovina · 2019-10-10T14:03:49Z

cc @jameswex and @tolga-b

jameswex

i think the plots need some right padding (4 or 8px, and maybe then you can remove some left padding) otherwise the right-most one can hug the right edge of WIT with no spacing.

jameswex · 2019-10-11T13:41:12Z

...ractive_inference/tf_interactive_inference_dashboard/tf-interactive-inference-dashboard.html

+                  plotThresholds,
+                  regenInferenceStats,
+                  true
+                );


should really do ROC and PR curves for each class just like with binary

ok! That should be trivial now ;)

jameswex · 2019-10-11T13:44:31Z

...ractive_inference/tf_interactive_inference_dashboard/tf-interactive-inference-dashboard.html

@@ -4854,7 +5015,7 @@ <h2>Show similarity to selected datapoint</h2>
          this.featureValueThresholds = [];
          this.featureValueThresholds = this.sortFeatureValues(tempArray);

-          // ROC curves should only exist for the binary case
+          // ROC curves for the binary case


update comment since also does PR curves

👍 also renamed a couple of functions that only made reference to PR.

jameswex · 2019-10-11T13:46:31Z

...ractive_inference/tf_interactive_inference_dashboard/tf-interactive-inference-dashboard.html

+          return this.getRocChartId(index) + '-' + label;
+        },
+
+        getLabelVocab: function(index) {


call this getLabel or getLabelForIndex for clarity

Done. Picked getLabel to be consistent with the other methods like getRocChartId, getPrChartId.


jameswex · 2019-10-11T15:37:46Z

thanks @grovina, just the one comment about padding now

jameswex · 2019-10-11T15:42:57Z

also make sure to run lint

grovina

Working on the padding...

grovina · 2019-10-11T13:56:59Z

Done. Picked getLabel to be consistent with the other methods like getRocChartId, getPrChartId.

grovina · 2019-10-11T13:59:45Z

👍 also renamed a couple of functions that only made reference to PR.

grovina · 2019-10-11T14:16:47Z

ok! That should be trivial now ;)

For multi-class models, let's put the curves for each class on a separate line, so that they are easier to follow. For both binary and multi-class models, we adjust the margins and the position of the axis labels, and center the plots. I also renamed and cleaned up some CSS classes along the way.

grovina · 2019-10-11T16:38:53Z

Adjusted some CSS and fixed lint. Here are screenshots for:

the multi-class case:
the binary case:


grovina added 3 commits October 10, 2019 10:51

Add multi-class ROC to non-sliced case

efed80f

Extending previous sliced implementation to grouped case by simply considering it as a single slice with an empty string key.

Adjust ROC curve y axis position

1f943fc

googlebot added the cla: yes label Oct 10, 2019

Fix lint

a1461cc

jameswex reviewed Oct 11, 2019


grovina added 3 commits October 11, 2019 14:59

Rename method that gets a class label

8e05b8d

Just improving clarity.

Add PR curves to multi-class models

d83509e

Analogous to the binary case, using the same data and structures as the ROC curves.

Make method names consider ROC and PR curves

591ef8a

These methods are being used to determine whether to display ROC and PR curves, but only make reference to PR curves. Renaming them to something more generic that can take both (and possibly others) into account.

grovina commented Oct 11, 2019


jameswex approved these changes Oct 11, 2019


jameswex merged commit 2de6882 into tensorflow:master Oct 11, 2019

grovina added a commit to grovina/tensorboard that referenced this pull request Oct 11, 2019

Fix tabs left margin

9399c09

This was added by mistake in tensorflow#2755.

grovina mentioned this pull request Oct 11, 2019

Fix tabs left margin (What-If Tool) #2763

Merged
