new interval based cost function #2972
Conversation
Addresses issues with balancing of segments in the existing cost function:
- `gapPenalty` led to clusters of segments ~30 days apart
- `recencyPenalty` caused imbalance among recent segments
- size-based cost could be skewed by compression

The new cost function is purely based on segment intervals:
- assumes each time-slice of a partition is a constant cost
- cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C)
- cost decays exponentially based on distance between time-slices
    return 0;
  }
}

static final double HALF_LIFE = 24.0; // cost function half-life in hours
it'd be really nice to have some comments about what everything means and how the algo works
half life is by definition ln(2) / lambda
i.e. the time difference that will make the joint cost go down by half
👍 if we have more comments about how the algorithm works
@fjy added a few more comments + links to the PR description in the code
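For reference, the relationship between the half-life and the decay rate can be written out as follows. This is a minimal sketch; the class and the `LAMBDA` constant name are illustrative and not necessarily what the PR's code uses.

```java
class CostConstants
{
  // Half-life of the exponential decay, in hours (from the diff above).
  static final double HALF_LIFE = 24.0;

  // Decay rate chosen so the joint cost halves every HALF_LIFE hours:
  //   exp(-LAMBDA * HALF_LIFE) = 0.5   =>   LAMBDA = ln(2) / HALF_LIFE
  static final double LAMBDA = Math.log(2) / HALF_LIFE;
}
```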
return intervalCost(y0, y0, y1) +
       intervalCost(beta, beta, gamma) +
       // cost of exactly overlapping intervals of size beta
       2 * (beta + FastMath.exp(-beta) - 1);
Can this be a constant?
Or I suppose this is just the solution to the integral?
Can the comment be described as such?
@drcrallen this can't be a constant, since beta depends on interval start / end. And yes, this is the solution to
\int_0^{\beta} \int_{0}^{\beta} e^{-|x-y|}\,dx\,dy = 2 \cdot (\beta + e^{-\beta} - 1)
I'll add some comments
I meant the constant out front, before I realized it's just the solution to the integral.
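As a quick sanity check of that closed form, the double integral can be approximated numerically and compared against `2 * (beta + exp(-beta) - 1)`. This is a standalone sketch, not part of the PR; it uses plain `Math` rather than `FastMath`, and the value of `beta` is arbitrary.

```java
public class OverlapCostCheck
{
  public static void main(String[] args)
  {
    final double beta = 3.5;   // arbitrary interval length for the check
    final int n = 2_000;       // grid resolution
    final double h = beta / n;

    // Midpoint-rule approximation of \int_0^beta \int_0^beta e^{-|x - y|} dx dy
    double numeric = 0.0;
    for (int i = 0; i < n; i++) {
      for (int j = 0; j < n; j++) {
        final double x = (i + 0.5) * h;
        final double y = (j + 0.5) * h;
        numeric += Math.exp(-Math.abs(x - y)) * h * h;
      }
    }

    final double closedForm = 2 * (beta + Math.exp(-beta) - 1);
    System.out.printf("numeric=%.6f closedForm=%.6f%n", numeric, closedForm);
  }
}
```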
{
  final long gapMillis = gapMillis(segment1.getInterval(), segment2.getInterval());
you can probably also delete static method gapMillis now
I already removed it.
👍
* new interval based cost function

  Addresses issues with balancing of segments in the existing cost function:
  - `gapPenalty` led to clusters of segments ~30 days apart
  - `recencyPenalty` caused imbalance among recent segments
  - size-based cost could be skewed by compression

  New cost function is purely based on segment intervals:
  - assumes each time-slice of a partition is a constant cost
  - cost is additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C)
  - cost decays exponentially based on distance between time-slices

* comments and formatting
* add more comments to explain the calculation
Backport balancing improvements apache#2910 apache#2964 apache#2972
Awesome work! Can't wait to migrate. As a little teaser, could you maybe let me know how much impact this new balancing strategy has on query performance?
@xvrl could you comment on why you used `FastMath`?
It's interesting because according to https://issues.apache.org/jira/browse/MATH-1422 "FastMath" may not actually be faster, but OTOH it is supposed to be more precise. FYI @dgolitsyn
@leventov I only used `FastMath.exp`
Our current segment balancing algorithm uses a cost function which has some unintended effects on the distribution of segments in the cluster. While it keeps the total number of segments roughly balanced across nodes, it does not evenly distribute segments within a given interval. This leads to many queries hitting only a fraction of the historical nodes in the cluster.
This is a proposal to revamp the cost function to address those shortcomings and improve overall segment distribution.
TL;DR here's how segment distribution has improved in the days since we rolled out this improved cost function to one of our Druid clusters (each dot is a segment, each color represents a different data source).
Issues with our existing cost function
Here is a breakdown of the issues with the current cost function
Segments tend to form clusters with large gaps in between
If you take two segments roughly 30 days apart, the existing `gapPenalty` causes the cost of putting a segment anywhere between those two to be higher than adding a segment that overlaps with them. This causes the balancing to converge to clusters of segments ~30 days apart, as seen below.

Segments for recent time intervals tend to be distributed unevenly
The recencyPenalty, in combination with the gapPenalty, makes the cost function non-trivial for recent segments and causes strange gaps of data within the most recent 7 days worth of data.
Cost is skewed by segment byte sizes

The base joint cost between two segments is tied to segment byte sizes. This means the cost is affected by different levels of compression and is not necessarily tied to the actual cost of scanning a segment.
Improved cost function
Our improved cost function is designed to be simple and have the following properties:

1. the cost of a segment should depend only on its time interval, not its size
2. segments close in time should have a higher joint cost than segments that are far apart
3. the joint cost should be additive, i.e. cost(A, B union C) = cost(A, B) + cost(A, C)
To satisfy 1. we will assume that the cost of scanning a unit time-slice of data is constant. This means our cost function will only depend on segment intervals.
To model the joint cost of querying two time-slices and satisfy 2., we use an exponential decay `exp(-lambda * t)`, where lambda defines the rate at which the interaction decays and t is the relative time difference between the two time-slices. Currently lambda is set to give the decay a half-life of 24 hours.

Since we assume the cost of querying each time-slice is constant, we can compute the joint cost of two segments by simply integrating over the segment intervals, i.e. for two segments X and Y covering intervals [x_0, x_1) and [y_0, y_1) respectively, the joint cost becomes:

cost(X, Y) = \int_{x_0}^{x_1} \int_{y_0}^{y_1} e^{-\lambda |x-y|} \,dy\,dx
This has the nice property of being additive as in 3. and of having a closed form solution.
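To spell out why: additivity follows from splitting the inner integral over disjoint intervals, and for the simple case of two disjoint intervals (rescaling time so that lambda = 1) the integral reduces to a product of exponentials:

\int_{A} \int_{B \cup C} e^{-\lambda |x-y|} \,dy\,dx = \int_{A} \int_{B} e^{-\lambda |x-y|} \,dy\,dx + \int_{A} \int_{C} e^{-\lambda |x-y|} \,dy\,dx

\int_{x_0}^{x_1} \int_{y_0}^{y_1} e^{-(y-x)} \,dy\,dx = (e^{x_1} - e^{x_0}) (e^{-y_0} - e^{-y_1}) \quad \text{for } y_0 \ge x_1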
One nice side-effect of additivity is that re-indexing the same data at different segment granularities (e.g. going from hourly segments to daily segments) will not affect the joint cost, assuming the cost of scanning a unit of time stays the same per shard.
Segments of the same data source are of course more likely to be queried simultaneously, so we multiply the cost by a constant factor (2 in this case) if the two segments are from the same data source. This ensures that if two data sources are distributed equally, we are more likely to spread their segments across servers.
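Putting the pieces together, here is a self-contained sketch of how the joint cost might be computed for two disjoint segments and then weighted by the data-source factor. The class, method names, and example data-source strings are illustrative assumptions, not the exact code in the PR; the closed form used here is the disjoint-interval case shown above with the decay rate folded back in.

```java
public class JointCostSketch
{
  // Decay rate giving the joint cost a half-life of 24 hours (time measured in hours).
  static final double LAMBDA = Math.log(2) / 24.0;

  /**
   * Closed-form joint cost of two disjoint intervals [x0, x1) and [y0, y1) in hours,
   * with y0 >= x1:  \int_{x0}^{x1} \int_{y0}^{y1} e^{-LAMBDA * (y - x)} dy dx
   */
  static double disjointIntervalCost(double x0, double x1, double y0, double y1)
  {
    return (Math.exp(LAMBDA * x1) - Math.exp(LAMBDA * x0))
           * (Math.exp(-LAMBDA * y0) - Math.exp(-LAMBDA * y1))
           / (LAMBDA * LAMBDA);
  }

  /** Weight same-data-source pairs more heavily so they get spread across servers. */
  static double jointCost(double intervalCost, String dataSource1, String dataSource2)
  {
    return dataSource1.equals(dataSource2) ? 2.0 * intervalCost : intervalCost;
  }

  public static void main(String[] args)
  {
    // Two daily segments of the same data source, one day apart.
    double intervalCost = disjointIntervalCost(0, 24, 48, 72);
    System.out.println(jointCost(intervalCost, "events", "events"));
  }
}
```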
Hopefully this all makes sense, but comments are of course welcome.
Regarding performance, the new cost function is more expensive to compute than the existing one, but thanks to the optimizations in #2910 it will not be worse than what we have in our current release, and maybe even a little bit faster still.
The results of rolling out this new cost function speak for themselves. You can see how segment distribution has massively improved in our Druid cluster for the problem cases depicted above.