[ML] Validate existing cluster state differently to newly submitted configs #30084

elasticmachine · 2017-09-05T08:54:09Z

Original comment by @droberts195:

If we're going to introduce completely new job types in the future, we need to change the way unknown job/datafeed cluster state is validated.

While trying to add categorizer jobs, which are quite similar to anomaly_detector jobs, I ran into the following problem:

Logically, a categorizer job should have no detectors, but the AnalysisConfig class requires detectors.

There are two possible solutions that seem reasonable at first glance:

1. Give categorizer jobs a categorization_config instead of an analysis_config
2. Change analysis_config so that detectors is not required if the job_type is categorizer

Unfortunately neither of these works:

1. Old nodes will ignore categorization_config when parsing metadata, but then fail because Job requires an analysis_config
2. Old nodes will not tolerate an analysis_config with no detectors

This results in the messy solution that categorizer jobs must carry an analysis_config that includes unnecessary fields. New nodes will ignore these fields and mask them when printing the config in REST responses, but old nodes will show the unnecessary bits.
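To make the two failure modes concrete, here is a toy model of how an old node might parse a job it reads from cluster state. It is only a sketch: `OldNodeJobParser`, its methods and the error messages are hypothetical, and the job is represented as an already-decoded `Map<String, Object>` rather than going through the real parsing code.

```java
import java.util.List;
import java.util.Map;

// Toy model of an old node's strict job parsing; hypothetical, not the real Elasticsearch code.
final class OldNodeJobParser {

    /** Returns normally if the job parses, otherwise throws IllegalArgumentException. */
    static void parse(Map<String, Object> job) {
        // Unknown top-level fields such as categorization_config are never read,
        // which models "old nodes will ignore categorization_config".
        Object analysisConfig = job.get("analysis_config");
        if (!(analysisConfig instanceof Map)) {
            throw new IllegalArgumentException("analysis_config is required");
        }
        // An analysis_config without detectors is rejected as well.
        Object detectors = ((Map<?, ?>) analysisConfig).get("detectors");
        if (!(detectors instanceof List) || ((List<?>) detectors).isEmpty()) {
            throw new IllegalArgumentException("at least one detector is required");
        }
    }

    /** Convenience wrapper: true if parse() accepts the job, false if it throws. */
    static boolean parses(Map<String, Object> job) {
        try {
            parse(job);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

Under this model, a job with only a categorization_config is rejected (option 1), a job whose analysis_config has an empty detectors list is rejected (option 2), and only a job that carries a placeholder detector gets through, which is the messy workaround described above.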

I think the only long-term solution that allows the necessary degree of extensibility is to hold Jobs as an arbitrary Map<String, Object> or BytesReference when parsing from cluster state, and to interpret the contents only if the job_type is understood. This is pretty much how index settings work.
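A minimal sketch of that idea, using a plain `Map<String, Object>` as the opaque representation. Everything here is illustrative: `OpaqueJob`, `fromClusterState`, `validateIfUnderstood` and the set of understood types are hypothetical names, and treating a missing job_type as anomaly_detector is an assumption made for the example.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical holder that keeps a job from cluster state as an opaque map.
final class OpaqueJob {
    // Job types this (imaginary) node version knows how to interpret.
    private static final Set<String> UNDERSTOOD_TYPES = Set.of("anomaly_detector");

    private final String jobType;
    private final Map<String, Object> source;

    private OpaqueJob(String jobType, Map<String, Object> source) {
        this.jobType = jobType;
        this.source = source;
    }

    /** Reading from cluster state never fails on content: the map is stored as-is. */
    static OpaqueJob fromClusterState(Map<String, Object> raw) {
        // Assumption: a job without an explicit job_type is an anomaly_detector.
        Object type = raw.getOrDefault("job_type", "anomaly_detector");
        return new OpaqueJob(String.valueOf(type),
            Collections.unmodifiableMap(new LinkedHashMap<>(raw)));
    }

    /** The untouched content, so it can be written back to cluster state unchanged. */
    Map<String, Object> source() {
        return source;
    }

    /**
     * Applies strict validation only when the job_type is understood.
     * Returns false if the type is unknown and validation was skipped.
     */
    boolean validateIfUnderstood() {
        if (!UNDERSTOOD_TYPES.contains(jobType)) {
            return false;
        }
        if (!(source.get("analysis_config") instanceof Map)) {
            throw new IllegalArgumentException("analysis_config is required");
        }
        return true;
    }
}
```

With this shape, a node that has never heard of categorizer jobs can still load, hold and re-serialize one, while newly submitted configs of a known type could continue to go through strict validation.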


droberts195 · 2018-12-05T10:18:31Z

We decided to avoid this problem by using a completely different class to store new types of jobs that are not anomaly_detectors.

elasticmachine added :ml Machine learning >enhancement labels Apr 25, 2018

davidkyle mentioned this issue Aug 16, 2018

[ML] Metadata Migration Meta issue #32905

Closed

43 tasks

This was referenced Oct 26, 2018

[ML] Consider validating jobs outside of the builder #34899

Open

[ML] Address parsing deprecated and removed features in Datafeed Config #34858

Closed

droberts195 closed this as completed Dec 5, 2018
