Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cumulative sum, moving avg and serial diff pipeline aggregation metric #10033

Merged
merged 4 commits into from
Feb 15, 2017

Conversation

ppisljar
Copy link
Member

@ppisljar ppisljar commented Jan 24, 2017

adds:
cumulative sum pipeline aggregation - closes #740
moving average pipeline aggregation -
serial diff pipeline aggregation

@ppisljar ppisljar added Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Feature:Visualizations Generic visualization features (in case no more specific feature label is available) blocked labels Jan 24, 2017
@ppisljar ppisljar force-pushed the agg/comulative-sum branch 2 times, most recently from 55e1d09 to e52d0b6 Compare February 7, 2017 09:44
},
deserialize: function (state, agg) {
return this.makeAgg(agg, state);
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you but you can compress these 6 lines to two via:

serialize: (customMetric) => customMetric.toJSON(),
deserialize: (state, agg) => this.makeAgg(agg, state),

One minor note, I like to keep the order of params somewhat consistent and here we are switching the order of agg and state, but I'm guessing this is part or the interface and wouldn't be a simple thing to change, so fine as is.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree (Regarding order of params) but as you noted its part of the interface

{
group: 'none',
name: 'metricAgg',
title: 'Metric Agg',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is Metric Agg specific to derivatives or a certain type of visualization? I think it's only used for derivates right now, but the name is so generic, it sounds like it can be used elsewhere? I guess I don't completely understand what a metric agg is. In the UI, is it the Y-Axis Aggregation spot? But only when it's not one of the aggregations listed above. Like a Count is a Metric Agg but Top Hits is some other kind of Agg? Or is only derivative a metric agg?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right now its only used by derivative (and comulative sum), we'll be adding other pipeline aggregations shortly and they all use this.

its the metric you define on the derivative (either custom metric or one of the pre existing metric aggs) .... list is filtered down as not all metrics are supported with derivative (you can't calculate a derivative on top hits for example)

import { parentPipelineAggController } from './lib/parent_pipeline_agg_controller';
import { parentPipelineAggWritter } from './lib/parent_pipeline_agg_writter';

const ParentPipelineAggHelperProvider = function (Private) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to confirm I understand the term parentPipeline, it s when in the Y-Axis you have a sub aggregation, and right now is only derivatives, but you split this logic out because it will be relevant to the cumulative sum you mentioned that you are also adding?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exactly

Copy link
Contributor

@stacey-gammon stacey-gammon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the 07bd1f5 commit specifically - LGTM

title: 'Cumulative Sum',
makeLabel: agg => makeNestedLabel(agg, 'cumulative sum'),
params: [
...parentPipelineAggHelper.params()
Copy link
Contributor

@stacey-gammon stacey-gammon Feb 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. So how is this different from derivatives? Looks the exact same but with a different name & title. The specific calculation differences must be somewhere. :)

... maybe in makeAgg: function (termsAgg, state) { the termsAgg.vis is different?

It could may be even be refactored further because the only difference between the two is three strings, that could really even be one string if you applied the same normalizations to them (lowercase for nested label, lc and spaces => underscore for name, title as is).

Maybe unnecessary, just a thought. I'm still trying to figure out how choosing cumulative sum matches to the different calculations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so yeah, its pretty much the same, but different ES query will get executed (the name defines the name used in ES query .... so this one will run cumulative_sum instead of derivative ... but apart from that they are the same.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why i don't want to refactor this further: at this point they are indeed the same (cumulative sum and derivative) but they actually have different sub options (which we don't expose right now but probably will in the future).

with the current form it'll be really easy to add the parameters that are different + still not be duplicating logic for the parameters that are the same.

@ppisljar
Copy link
Member Author

ppisljar commented Feb 7, 2017

@stacey-gammon i added moving_avg and serial_diff to this PR as well as it might make it clearer.
once you confirm this looks OK i'll start adding tests and documentation

@stacey-gammon
Copy link
Contributor

Awesome, looks good and thanks for the explanations, helped a lot.

@ppisljar ppisljar changed the title WIP: cumulative sum metric aggregation cumulative sum, moving avg and serial diff pipeline aggregation metric Feb 9, 2017
@ppisljar ppisljar added the review label Feb 9, 2017
import VisProvider from 'ui/vis';
import StubbedIndexPattern from 'fixtures/stubbed_logstash_index_pattern';

const metrics = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the UI, consider using the same division that you used for pipeline aggs. it presents well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will add it in the bucket PR (will rebase in on master once this gets merged)

@thomasneirynck
Copy link
Contributor

jenkins, test this

Copy link
Contributor

@thomasneirynck thomasneirynck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but let's wait on jenkins...

@ppisljar
Copy link
Member Author

Peter Pisljar [17:46]
are pipeline aggregations 'hard' on the cluster ?
we are adding pipeline aggregations to kibana and we allow you to do multiple levels (like 5th derivative ...) and i am wondering if this could have a big performance problems ?

Colin Goodheart-Smithe [17:49]
@ppisljar no, shouldn’t have particularly big performance issues since the pipeline aggregation only act on the final reduced results of the aggregations rather than on the documents themselves. Having said that, if you did something like 1 millisecond interval on a histogram over the range of a year then I would expect there to be issues, but I would expect that regardless of whether there are pipeline aggs involved

@ppisljar
Copy link
Member Author

jenkins, test this

1 similar comment
@ppisljar
Copy link
Member Author

jenkins, test this

@ppisljar ppisljar merged commit 723c7c1 into elastic:master Feb 15, 2017
elastic-jasper added a commit that referenced this pull request Feb 15, 2017
Backports PR #10033

**Commit 1:**
extracting to parent agg helper

* Original sha: b92535f
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T09:38:00Z

**Commit 2:**
adding comulative sum agg

* Original sha: 457338f
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T09:38:19Z

**Commit 3:**
adding serial diff and moving avg

* Original sha: c104481
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T16:26:35Z

**Commit 4:**
adding tests

* Original sha: 02b3371
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-09T12:43:20Z
ppisljar pushed a commit that referenced this pull request Feb 15, 2017
#10376)

Backports PR #10033

**Commit 1:**
extracting to parent agg helper

* Original sha: b92535f
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T09:38:00Z

**Commit 2:**
adding comulative sum agg

* Original sha: 457338f
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T09:38:19Z

**Commit 3:**
adding serial diff and moving avg

* Original sha: c104481
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-07T16:26:35Z

**Commit 4:**
adding tests

* Original sha: 02b3371
* Authored by ppisljar <peter.pisljar@gmail.com> on 2017-02-09T12:43:20Z
@olivierlambert
Copy link

@ppisljar that's great news! Any idea in which release this will be available? (therefore, when?)

@ppisljar
Copy link
Member Author

hopefully in the 5.4

@ppisljar ppisljar deleted the agg/comulative-sum branch February 20, 2017 10:49
@olivierlambert
Copy link

👍 Thanks for your quick answer!

@jimgoodwin
Copy link

How do you specify "lag" for serial diff or for moving average "window size" ?

@ppisljar
Copy link
Member Author

ppisljar commented Mar 1, 2017

click the advanced setting under your seriial diff agg and in json input enter your additional configuration like { "lag": 10 }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Aggregations Aggregation infrastructure (AggConfig, esaggs, ...) Feature:Visualizations Generic visualization features (in case no more specific feature label is available) review v5.4.0 v6.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Total over time (integrals)
5 participants