[DOCS] Fix classification score details in example (#1368)
lcawl committed Sep 22, 2020
1 parent 99210ac commit ca66ab0
Showing 1 changed file with 14 additions and 17 deletions.
31 changes: 14 additions & 17 deletions docs/en/stack/ml/df-analytics/flightdata-classification.asciidoc
@@ -316,14 +316,14 @@ or testing data set. You can filter the table and the confusion matrix such that
they contain only testing or training data. You can also enable histogram charts
to get a better understanding of the distribution of values in your data.

-If you examine this destination index more closely in the *Discover* app in
-{kib} or use the standard {es} search command, you can see that the analysis
-predicts the probability of all possible classes for the dependent variable (in
-a `top_classes` object). In this case, there are two classes: `true` and
-`false`. The most probable class is the prediction, which is what's shown in the
-{classification} results table. If you want to understand how sure the model is
-about the prediction, however, you might want to examine the class probability
-values. A higher number means that the model is more confident.
+If you want to understand how certain the model is about each prediction, you
+can examine its probability and score (`ml.prediction_probability` and
+`ml.prediction_score`). The higher these values are, the more confident the
+model is that the data point belongs to the named class. If you examine the
+destination index more closely in the *Discover* app in {kib} or use the
+standard {es} search command, you can see that the analysis predicts the
+probability of all possible classes for the dependent variable. The
+`top_classes` object contains the predicted classes with the highest scores.

.API example
[%collapsible]
@@ -334,7 +334,6 @@ GET df-flight-delayed/_search
--------------------------------------------------
// TEST[skip:TBD]
The snippet below shows a part of a document with the annotated results:
[source,console-result]
@@ -372,14 +371,12 @@ The snippet below shows a part of a document with the annotated results:
}
----
<1> An array of values specifying the probability of the prediction and the
-`class_score` for each class.
-The `top_classes` object contains the predicted classes with the highest
-scores. The `class_probability` is a value between 0 and 1. The higher the
-number, the more confident the model is that the data point belongs to the named
-class. In the example above, `false` has a `class_probability` of 0.91 while
-`true` has only 0.08, so the prediction will be `false`. The `class_score` is a
-function of the probability.
+score for each class.
+The class with the highest score is the prediction. In this example, `false` has
+a `class_score` of 0.37 while `true` has only 0.08, so the prediction will be
+`false`. For more details about these values, see
+<<dfa-classification-interpret>>.
////
It is chosen so that the decision to assign the
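
The rule stated in the revised text (the class with the highest `class_score` is the prediction) can be illustrated with a minimal Python sketch. It is not part of the commit. The document shape, the `class_name` key, the use of strings for class names, and the helper `predicted_class` are assumptions made for illustration. Only the two score values, 0.37 and 0.08, come from the example in the diff.

```python
# Hypothetical excerpt of one result document from the destination index.
# Only the class_score values (0.37 and 0.08) are taken from the diff above;
# the surrounding structure and the class_name key are assumed.
doc = {
    "ml": {
        "top_classes": [
            {"class_name": "false", "class_score": 0.37},
            {"class_name": "true", "class_score": 0.08},
        ]
    }
}


def predicted_class(document):
    """Return the class_name of the entry with the highest class_score."""
    top_classes = document["ml"]["top_classes"]
    best = max(top_classes, key=lambda entry: entry["class_score"])
    return best["class_name"]


print(predicted_class(doc))
```

With the values from the example, the entry for `false` has the larger score, so the sketch selects `false`, matching the prediction described in the added lines.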
