Rounding of percentiles (premature optimization?) #255

Closed
rnewson opened this issue Mar 16, 2015 · 6 comments

rnewson commented Mar 16, 2015

I noticed that all my percentile values appear to be rounded. On investigation, it's deliberate (commit 7f0b2d6) for bandwidth reasons.

Is it true that sending 58000 is more efficient than sending 58459? Is it true enough to warrant discarding this much data?

heyman commented Mar 16, 2015

Hi!

It's not that sending the integer 58000 takes up less space than 58459. We construct a dict whose keys are the response times and whose values are the number of requests that completed in that response time.

So let's say we have a test running whose response times are between 200 ms and 20k ms. If you run that test long enough without the rounding, you end up with a dict containing almost 20k keys, and dicts with at least a couple of thousand keys would be sent from each slave node to the master node every three seconds (and you might have many hundreds of slave nodes). Compare this to the current version, with rounding, where the dict wouldn't have more than about 310 keys.
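
A minimal sketch of the kind of rounding described (illustrative only; the exact thresholds and names here are my assumptions, not necessarily Locust's code):

```python
def rounded_response_time(t_ms):
    # Hypothetical helper: round a response time (ms) so the stats dict
    # keeps a small, bounded number of distinct keys.
    if t_ms < 100:
        return t_ms                   # keep 1 ms resolution below 100 ms
    elif t_ms < 1000:
        return int(round(t_ms, -1))   # nearest 10 ms
    elif t_ms < 10000:
        return int(round(t_ms, -2))   # nearest 100 ms
    else:
        return int(round(t_ms, -3))   # nearest 1000 ms

# response_times maps rounded response time -> number of requests
response_times = {}
for t in (203, 204, 1234, 1249, 58459):
    key = rounded_response_time(t)
    response_times[key] = response_times.get(key, 0) + 1

print(response_times)  # {200: 2, 1200: 2, 58000: 1}
```

With a scheme like this the dict can never grow beyond a few hundred keys, no matter how long the test runs.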

Also, I can't see any load testing scenario where the lost precision of the rounding would matter.

rnewson commented Mar 16, 2015

ok, I see how that reduces the size of the dict, though it raises further questions (for another time).

Perhaps the reporting of percentiles should 'x' out the rounded digits to make it clearer that the values have been rounded? I first started looking into this when I saw that all my stats were rounded.

heyman commented Mar 16, 2015

The reasoning behind it is that rounding to the nearest 10 ms is okay for requests between 100 ms and 1k ms, rounding to the nearest 100 ms is OK between 1k and 10k ms, and so on. There might be cases where you would benefit from knowing that the 95th percentile is exactly 58459 ms rather than somewhere between 57500 and 58499, but if they exist they should be extremely rare.

I don't think we should "X out" the zeroes (since that might complicate further processing of the data), but it could probably be a good idea to document this behaviour better.
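
As a rough worked example of the precision being discussed (plain Python, just arithmetic):

```python
exact = 58459
rounded = int(round(exact, -3))      # 58000; anything in 57500..58499 maps to the same key
print(abs(exact - rounded) / exact)  # ~0.0079, i.e. less than 1% relative error
```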

heyman closed this as completed Mar 16, 2015
rehevkor5 commented

I am not sure that the histogram bucketing approach is necessarily a problem, although I believe there are probably ways to implement a fully accurate report at the end without worrying about streaming data to a master node during test execution. And I think that the UI should communicate some kind of accuracy estimate, such as "+/- X" where X is some value.

Specifically, I think the target of opportunity is this logic in calculate_response_time_percentile, which ceilings the latency to the bucket rather than interpolating:

    for response_time in sorted(response_times.keys(), reverse=True):

The idea is that there's a difference (mathematically speaking) between the number of requests in the percentile vs. the number of requests in the buckets. If the difference is 0% of the last bucket, then indeed the bucket is the value. However, if the difference is 99% of the last bucket, then the value should be far closer to the previous bucket.

You can see an example of this in Prometheus' code for histogram_quantile https://github.com/prometheus/prometheus/blob/main/promql/quantile.go#L114
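
A rough sketch of that kind of within-bucket linear interpolation over a bucketed response-time dict (a hypothetical helper under assumed bucket semantics, not Locust's actual calculate_response_time_percentile):

```python
def interpolated_percentile(response_times, percent):
    # response_times: {bucket_upper_bound_ms: request_count}, percent: 0..1
    total = sum(response_times.values())
    if total == 0:
        return 0
    rank = percent * total          # number of requests at or below the percentile
    cumulative = 0
    prev_bound = 0
    for bound in sorted(response_times):
        count = response_times[bound]
        if cumulative + count >= rank:
            # take only the fraction of this bucket needed to reach the rank,
            # instead of jumping straight to the bucket's upper bound
            fraction = (rank - cumulative) / count
            return prev_bound + (bound - prev_bound) * fraction
        cumulative += count
        prev_bound = bound
    return prev_bound

print(interpolated_percentile({100: 50, 200: 40, 1000: 10}, 0.95))
# -> 600.0, whereas simply ceiling to the bucket would report 1000
```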

You may also be interested in Micrometer's approach to choosing the bucket values, in this class: https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/distribution/PercentileHistogramBuckets.java

However, Locust has no particular need to use hard-coded histogram buckets. That being the case, it might make more sense to use a more accurate summary mechanism such as http://hdrhistogram.org/
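
For reference, a minimal sketch of what that could look like with the Python hdrhistogram package (hdrh); the usage below is my assumption from that library's documented API, not something Locust does today:

```python
from hdrh.histogram import HdrHistogram

# Track 1 ms .. 1 hour with 3 significant digits of precision.
hist = HdrHistogram(1, 60 * 60 * 1000, 3)
for response_time_ms in (203, 1234, 58459):
    hist.record_value(response_time_ms)

print(hist.get_value_at_percentile(95))  # percentile with bounded relative error

# A worker could ship the compact encoded form to the master instead of a raw dict:
payload = hist.encode()
```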

cyberw commented Oct 6, 2021

Hi @rehevkor5 ! That sounds like a nice improvement. I’ve always felt that the percentiles are not accurate enough. Do you have time to look into implementing it?

rehevkor5 commented

Cool. I will put it on my list but I can't make any guarantees :)
