Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring stats to handle custom percentiles #1477

Merged
merged 16 commits into from
Aug 13, 2020

Conversation

vstepanov-lohika-tix
Copy link
Contributor

@vstepanov-lohika-tix vstepanov-lohika-tix commented Jul 10, 2020

Goal
Currently, if we set a custom list of percentiles, we'll get a broken csv report & console stats for percentiles.
This PR is intended to:

  • Handling percentiles list in a single place
  • Get proper csv report with custom percentiles set
  • Get proper console stats with custom percentiles set
  • Decrease of duplication of percentile values in code

How-to check:
In the locust file, set the list of percentiles you need to get:

import locust.stats
locust.stats.PERCENTILES_TO_REPORT = [0.50, 0.90, 0.95, 0.99]

Both csv report & console stats should be correct with a proper columns number and data.

@codecov
Copy link

codecov bot commented Jul 10, 2020

Codecov Report

Merging #1477 into master will decrease coverage by 0.24%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1477      +/-   ##
==========================================
- Coverage   81.47%   81.23%   -0.25%     
==========================================
  Files          27       27              
  Lines        2386     2398      +12     
  Branches      366      370       +4     
==========================================
+ Hits         1944     1948       +4     
- Misses        351      358       +7     
- Partials       91       92       +1     
Impacted Files Coverage Δ
locust/stats.py 88.96% <100.00%> (-0.06%) ⬇️
locust/clients.py 90.19% <0.00%> (-4.91%) ⬇️
locust/event.py 87.50% <0.00%> (-4.40%) ⬇️
locust/main.py 18.66% <0.00%> (-0.19%) ⬇️
locust/runners.py 80.81% <0.00%> (+0.20%) ⬆️
locust/user/task.py 96.75% <0.00%> (+0.54%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9b97c74...d85e0a0. Read the comment docs.

@vstepanov-lohika-tix
Copy link
Contributor Author

Updated tests -> Passed. Moving to Open.

@vstepanov-lohika-tix vstepanov-lohika-tix marked this pull request as ready for review July 10, 2020 11:47
@vstepanov-lohika-tix
Copy link
Contributor Author

Found issue with rounding:
image
Will fix in the next commit

@vstepanov-lohika-tix
Copy link
Contributor Author

Fixed:
image

@vstepanov-lohika-tix
Copy link
Contributor Author

Fixed console stats separator to any number of percentiles:
image

@cyberw
Copy link
Collaborator

cyberw commented Jul 12, 2020

Cool stuff! I think this may be better implemented as a command line switch, but I’m not wholly against it as is. It needs a proper test case (that actually modifies the setting and ensures it changes the output), and documentation.

@cyberw
Copy link
Collaborator

cyberw commented Jul 13, 2020

@heyman what do you think? Should this be a command line switch? I think we need to give some thought to how we manage settings in general (as the list of command line settings keeps growing)

@vstepanov-lohika-tix
Copy link
Contributor Author

@cyberw Added a doc for customization of stats settings. Not sure about command-line options, but probably it's worth to add it to configuration file.

@vstepanov-lohika-tix
Copy link
Contributor Author

vstepanov-lohika-tix commented Jul 21, 2020

@cyberw @heyman Please write if something else should be added/updated

locust/stats.py Outdated
'100%',
))
console_logger.info("-" * (90 + STATS_NAME_WIDTH))
headers = ('Type', 'Name', '# reqs') + tuple([f"{round(percentile*100, 4)}%" for percentile in PERCENTILES_TO_REPORT])
Copy link
Contributor

@lhupfeldt lhupfeldt Aug 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a max number of decimal points assumed? Rounding like this can cause different percentiles to round to the same number. If a max number of decimal points is assumed, then there should be validation of the PERCENTILES_TO_REPORT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment maximum percentile in default list is 0.99999, that matching to 3 digits after point (99.999%)
I've updated to 6 digits and believe that no-one ever will try to use percentiles with higher precision, so I think to do the additional calculation from percentile list will be redundant.

Copy link
Contributor

@lhupfeldt lhupfeldt Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will probably do, but you should still verify the PERCENTILES_TO_REPORT so that the web interface does not break.

Copy link
Contributor Author

@vstepanov-lohika-tix vstepanov-lohika-tix Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes assume that a user will manually set a list of percentiles based on the default list, and this assumes that a user understands what does he do and why. Otherwise, we should handle all possible wrong or not relevant values that could be input ( >1, <0, not a number etc). There's no reason to set a percentile precision over 99.999999% in a performance testing world, and even if it will happen - it's just will cause a wrong header name in results stats.
So, imho it's a very little sense to add additional calculation in code to cover a non-breaking issue that unlikely could occur.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In web.py there are assumptions about a few specific percentiles. I think it should be validated that those are present in the list. I agree that it is otherwise fine to assume that users will know to put relevant numbers in the list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, I'll check web.py additionally

@lhupfeldt
Copy link
Contributor

lhupfeldt commented Aug 5, 2020

stats_history_csv_header still has hardcoded list of percentiles

@lhupfeldt
Copy link
Contributor

web.py has hardcoded percentiles in different places:

"ninetieth_response_time": s.get_response_time_percentile(0.9),

                report["current_response_time_percentile_95"] = environment.runner.stats.total.get_current_response_time_percentile(0.95)
                report["current_response_time_percentile_50"] = environment.runner.stats.total.get_current_response_time_percentile(0.5)

Not sure how median plays into this?
Maybe it should be required that the percentiles referenced from web.py must be in the percentiles list.

@cyberw
Copy link
Collaborator

cyberw commented Aug 7, 2020

This looks good to me, if nobody objects, I will merge it this weekend.

@lhupfeldt
Copy link
Contributor

stats_history_csv_header still has hardcoded list of percentiles

@cyberw
Copy link
Collaborator

cyberw commented Aug 8, 2020

stats_history_csv_header still has hardcoded list of percentiles

Good point. I will hold off.

@vstepanov-lohika-tix
Copy link
Contributor Author

Will check stats_history_csv_header and web.py additionally

@vstepanov-lohika-tix
Copy link
Contributor Author

vstepanov-lohika-tix commented Aug 10, 2020

@cyberw @lhupfeldt Updated stats_history_csv_header as well and extracted repetative action to a function.
Regarding web.py - checked additionally, percentile values in this file are not linked to PERCENTILES_TO_REPORT parameter and calculated "on a fly", so looks there's no need to check those values according to this list.

@cyberw
Copy link
Collaborator

cyberw commented Aug 11, 2020

Nice. Let's give @heyman one last chance to comment before I merge :)

@cyberw cyberw merged commit d34fcc0 into locustio:master Aug 13, 2020
@cyberw
Copy link
Collaborator

cyberw commented Aug 13, 2020

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants