Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report timing stats as part of the Job stats response #42709

Merged

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented May 30, 2019

Initial version of timing stats reporting.
TimingStats object returned as part of the GetJobStats response contains 3 numbers: min, max and avg of bucket processing times.
Additionally, it contains bucket count. This bucket count may differ from the bucket count stored in DataCounts.

Related issue: #29857

@przemekwitek przemekwitek added the :ml Machine learning label May 30, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/bwc

@przemekwitek przemekwitek force-pushed the anomaly_detector_timing_stats branch 5 times, most recently from acf1415 to ca04d84 Compare June 5, 2019 08:47
@przemekwitek przemekwitek marked this pull request as ready for review June 5, 2019 08:51
@droberts195
Copy link
Contributor

I think you need to add a method ElasticsearchMappings.addTimingStatsMapping() that's similar to ElasticsearchMappings.addDataCountsMapping() and used in a similar way.

Additionally, the new field names should go in ReservedFieldNames.RESERVED_RESULT_FIELD_NAME_ARRAY.

Finally, are you still planning to add another field for some sort of exponential moving average in addition to the average over all time?

@@ -273,11 +286,13 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeOptionalString(assignmentExplanation);
out.writeOptionalTimeValue(openTime);
out.writeOptionalWriteable(forecastStats);
out.writeOptionalWriteable(timingStats);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be version-aware so it works in a mixed version cluster:

    if (out.getVersion().onOrAfter(V_7_3_0)) {
        out.writeOptionalWriteable(timingStats);
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException {
assignmentExplanation = in.readOptionalString();
openTime = in.readOptionalTimeValue();
forecastStats = in.readOptionalWriteable(ForecastStats::new);
timingStats = in.readOptionalWriteable(TimingStats::new);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be version-aware so it works in a mixed version cluster:

    if (in.getVersion().onOrAfter(V_7_3_0)) {
        timingStats = in.readOptionalWriteable(TimingStats::new);
    } else {
        timingStats = null;
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0) immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195: Done.
@benwtrent: Any hints on how to add BWC for job stats?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@przemekwitek you should be able to add them to the BWC YAML tests. These are in xpack.qa.rolling-upgrade.resources. Probably cases added in 30_ml_jobs_crud.yml

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed BWC tests with @dimitris-athanasiou and he suggested that it will be easier if I add them (in a separate PR) after merging in this PR into master and backporting into 7.x. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it will be much easier to add the BWC tests in a follow-up PR. Otherwise you'd have to add them on the basis that 7.x doesn't contain this functionality, then adjust both master and 7.x once 7.x does contain this functionality. So I agree with the idea of adding them in a separate PR.

@benwtrent benwtrent self-requested a review June 6, 2019 12:09
@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException {
assignmentExplanation = in.readOptionalString();
openTime = in.readOptionalTimeValue();
forecastStats = in.readOptionalWriteable(ForecastStats::new);
timingStats = in.readOptionalWriteable(TimingStats::new);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@droberts195 is right. Additionally, making it onOrAfter(V_7_3_0) immediately could break the bwc tests as the actual v7.3.0 code has not been backported to yet.

@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(jobId);
out.writeOptionalLong(bucketCount);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bucketCount is not optional. If it wrote a null there would be errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

* Returns true if given stats objects differ from each other by more than 10% for at least one of the statistics.
*/
public static boolean differSignificantly(TimingStats stats1, TimingStats stats2) {
return differSignificantly(stats1.minBucketProcessingTimeMs, stats2.minBucketProcessingTimeMs, 0.1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make 0.1 a private constant so that it is self-documenting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.
I've also removed the "p" parameter as it is not used in any meaningful way at the moment.


public TimingStats(StreamInput in) throws IOException {
this.jobId = in.readString();
this.bucketCount = in.readOptionalLong();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bucketCount is not optional

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -65,8 +67,12 @@ protected Response createTestInstance() {
if (randomBoolean()) {
openTime = parseTimeValue(randomPositiveTimeValue(), "open_time-Test");
}
Response.JobStats jobStats = new Response.JobStats(jobId, dataCounts, sizeStats, forecastStats, jobState, node, explanation,
openTime);
TimingStats timingStats = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#nit

Suggested change
TimingStats timingStats = null;
TimingStats timingStats = randomBoolean() ? null : TimingStatsTests.createTestInstance("foo");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (refactored the whole method this way).

/**
* Updates the statistics (min, max, avg) for the given data point (bucket processing time).
*/
public void updateStats(double bucketProcessingTimeMs) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we disallow negatives?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -124,6 +135,11 @@ public Builder setModelSizeStats(ModelSizeStats modelSizeStats) {
return this;
}

public Builder setTimingStats(TimingStats timingStats) {
this.timingStats = timingStats;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.timingStats = timingStats;
this.timingStats = new TimingStats(timingStats);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate on why do you think this change is needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@przemekwitek TimingStats has a method that allows its internal state to mutate. Not doing a copy would allow the parameter timingStats#updateStats to be called AFTER it is passed to this method causing the Builder#timingStats to be updated as well.

https://repl.it/@ben_w_trent/mutateState

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that. It's just this setter in builder is only called once and the temporary object is not saved anywhere else:
paramsBuilder.setTimingStats(parseSearchHit(hit, TimingStats.PARSER, errorHandler));

But ok, I applied your suggestion to prevent some random refactoring of breaking this assumption.

@@ -185,6 +190,7 @@ public JobStats(StreamInput in) throws IOException {
assignmentExplanation = in.readOptionalString();
openTime = in.readOptionalTimeValue();
forecastStats = in.readOptionalWriteable(ForecastStats::new);
timingStats = in.readOptionalWriteable(TimingStats::new);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, since this is the first BWC issue JobStats serialization, BWC should be added for job stats.

@przemekwitek
Copy link
Contributor Author

I think you need to add a method ElasticsearchMappings.addTimingStatsMapping() that's similar to ElasticsearchMappings.addDataCountsMapping() and used in a similar way.

Done.

Additionally, the new field names should go in ReservedFieldNames.RESERVED_RESULT_FIELD_NAME_ARRAY.

Done.

Finally, are you still planning to add another field for some sort of exponential moving average in addition to the average over all time?

Yes, but I'd rather do it in a separate PR as I wanted to check in and verify the base work first.

@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/2
run elasticsearch-ci/bwc
run elasticsearch-ci/default-distro

@przemekwitek przemekwitek force-pushed the anomaly_detector_timing_stats branch from 3ab9022 to dc385ac Compare June 11, 2019 07:14
Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants