[DOCS] Add feature importance to classification example
lcawl committed Sep 30, 2020
1 parent 715c3ee commit cac81f1
Showing 4 changed files with 35 additions and 6 deletions.
2 changes: 1 addition & 1 deletion docs/en/stack/ml/df-analytics/dfa-classification.asciidoc
@@ -196,4 +196,4 @@ testing. This split of the data set is the _testing data set_. Once the model has
been trained, you can let the model predict the value of the data points it has
never seen before and compare the prediction to the actual value by using the
evaluate {dfanalytics} API.
////
////
28 changes: 25 additions & 3 deletions docs/en/stack/ml/df-analytics/flightdata-classification.asciidoc
@@ -123,7 +123,7 @@ large data sets using a small training sample greatly reduces runtime without
impacting accuracy.
.. If you want to experiment with <<ml-feature-importance,{feat-imp}>>, specify
a value in the advanced configuration options. In this example, we choose to
return a maximum of 10 feature importance values per document. This option
return a maximum of 10 {feat-imp} values per document. This option
affects the speed of the analysis, so by default it is disabled.
.. Use the default memory limit for the job. If the job requires more than this
amount of memory, it fails to start. If the available memory on the node is
@@ -170,7 +170,7 @@ PUT _ml/data_frame/analytics/model-flight-delay-classification
--------------------------------------------------
// TEST[skip:setup kibana sample data]
<1> The field name in the `dest` index that contains the analysis results.
<2> To disable feature importance calculations, omit this option.
<2> To disable {feat-imp} calculations, omit this option.
====
--
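
For orientation, the following is a minimal sketch of the `analysis` section that such a request could contain. The destination index name, training percentage, and dependent variable shown here are assumptions based on the {kib} flight sample data rather than values quoted from this file; only `num_top_feature_importance_values` corresponds to the option discussed above.

[source,console]
--------------------------------------------------
PUT _ml/data_frame/analytics/model-flight-delay-classification
{
  "source": {
    "index": "kibana_sample_data_flights"
  },
  "dest": {
    "index": "df-flight-delayed"
  },
  "analysis": {
    "classification": {
      "dependent_variable": "FlightDelay",
      "training_percent": 10,
      "num_top_feature_importance_values": 10
    }
  }
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]

Omitting `num_top_feature_importance_values` leaves {feat-imp} calculations disabled, as noted in callout <2>.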

@@ -331,7 +331,7 @@ can examine its probability and score (`ml.prediction_probability` and
model is that the data point belongs to the named class. If you examine the
destination index more closely in the *Discover* app in {kib} or use the
standard {es} search command, you can see that the analysis predicts the
probability of all possible classes for the dependent variable. The
probability of all possible classes for the dependent variable. The
`top_classes` object contains the predicted classes with the highest scores.
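
As a rough sketch only, a search along the following lines would surface those fields for an individual document; the destination index name `df-flight-delayed` is an assumption, while `ml.prediction_probability` and `ml.top_classes` follow the field names discussed above.

[source,console]
--------------------------------------------------
GET df-flight-delayed/_search
{
  "size": 1,
  "_source": [
    "ml.prediction_probability",
    "ml.top_classes"
  ]
}
--------------------------------------------------
// TEST[skip:setup kibana sample data]

Each hit then shows the predicted probability for the winning class alongside the `top_classes` object, which lists the classes with the highest scores for that document.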

.API example
@@ -417,6 +417,28 @@ summarized information in {kib}:
[role="screenshot"]
image::images/flights-classification-total-importance.png["Total {feat-imp} values in {kib}"]

You can also see the {feat-imp} values for each individual prediction in the
form of a decision plot:

[role="screenshot"]
image::images/flights-classification-importance.png["A decision plot for {feat-imp} values in {kib}"]
////
The sum of the {feat-imp} values for a class (in this example, `false`)
in this data point approximates the logarithm of its odds
(or {wikipedia}/Logit[log-odds]).
While the probability of a class ranges between 0 and 1, its log-odds range
between negative and positive infinity. In {kib}, the decision path for each
class starts at the average probability for that class over the training data
set. From there, the {feat-imp} values are added to the decision path.
The features with the most significant positive or negative impact appear at the
top. Thus in this example, the features related to flight time and distance had
the most significant influence on this prediction. This type of information can
help you to understand how models arrive at their predictions. It can also
indicate which aspects of your data set are most influential or least useful
when you are training and tuning your model.
////
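
As a rough worked illustration of that relationship (the numbers are invented): if the {feat-imp} values of the `false` class for a given data point sum to 1.1, the log-odds of that class is approximately 1.1, which corresponds to a probability of 1 / (1 + e^-1.1), or roughly 0.75; a sum of -1.1 would correspond to a probability of about 0.25.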

This type of information can help you to understand how models arrive at their
predictions. It can also indicate which aspects of your data set are most
influential or least useful when you are training and tuning your model.
Binary image file: not displayed.
11 changes: 9 additions & 2 deletions docs/en/stack/ml/df-analytics/ml-feature-importance.asciidoc
@@ -44,7 +44,14 @@ data point to that baseline, you arrive at the numeric prediction value. If a
{feat-imp} value is negative, it reduces the prediction value. If a {feat-imp}
value is positive, it increases the prediction value.

//TBD: Add section about classification analysis.
////
For {classanalysis}, the baseline is the average of the probability values for a
specific class across all the data points in the training data set. When you add
the feature importance values for a particular data point to that baseline, you
arrive at the prediction probability for that class. If a {feat-imp} value is
negative, it reduces the prediction probability. If a {feat-imp} value is
positive, it increases the prediction probability.
////
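
As a small worked illustration of the numeric case described above (the values are invented): if the baseline, that is, the average prediction over the training data set, is 180 and the {feat-imp} values of a data point are +30 and -5, the prediction for that data point is approximately 180 + 30 - 5 = 205.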

By default, {feat-imp} values are not calculated. To generate this information,
when you create a {dfanalytics-job} you must specify the
@@ -65,4 +72,4 @@ exPlanations) method as described in
https://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf[Lundberg, S. M., & Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In NeurIPS 2017].

See also
https://www.elastic.co/blog/feature-importance-for-data-frame-analytics-with-elastic-machine-learning[{feat-imp-cap} for {dfanalytics} with Elastic {ml}].
https://www.elastic.co/blog/feature-importance-for-data-frame-analytics-with-elastic-machine-learning[{feat-imp-cap} for {dfanalytics} with Elastic {ml}].
