-
-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Maximum Resident Set tracking #321
Conversation
It isn't complete but should work well as proof of concept |
Thank you very much for your contribution! I am planning to take a closer look soon. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution. I added a few comments.
I am inclined to accept such a change, but only if
- it is thoroughly tested (not necessarily a unit test, but at least some manual tests on Linux and Windows)
- all questions here and in the ticket can be answered. in particular, the question about the intermediate shell and the precise working mechanism of
getrusage
. - it nicely integrates within the code base (the namings of some structs/types/variables/functions are currently focused on times, not memory consumption)
src/hyperfine/benchmark.rs
Outdated
@@ -74,13 +78,15 @@ pub fn time_shell_command( | |||
time_real = subtract_shell_spawning_time(time_real, spawning_time.time_real); | |||
time_user = subtract_shell_spawning_time(time_user, spawning_time.time_user); | |||
time_system = subtract_shell_spawning_time(time_system, spawning_time.time_system); | |||
max_rss = subtract_shell_spawning_time(max_rss, spawning_time.max_rss); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so this subtracts the memory required by the shell? In this case, we should probably give the function a new name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I did the least amount of changes to get something going, if you have idea on function names please tell me :)
src/hyperfine/timer/unix_timer.rs
Outdated
@@ -55,6 +55,7 @@ fn get_cpu_times() -> CPUTimes { | |||
+ i64::from(result.ru_utime.tv_usec), | |||
system_usec: i64::from(result.ru_stime.tv_sec) * MICROSEC_PER_SEC | |||
+ i64::from(result.ru_stime.tv_usec), | |||
max_rss_byte: i64::from(result.ru_maxrss), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From man 2 getrusage
: "in kilo-bytes"
ru_maxrss (since Linux 2.6.32)
This is the maximum resident set size used (in kilo‐
bytes). For RUSAGE_CHILDREN, this is the resident set
size of the largest child, not the maximum resident set
size of the process tree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I'm not sure what to make of the "For RUSAGE_CHILDREN, this is the resident set size of the largest child, not the maximum resident set size of the process tree" comment. Does this mean that we would not get the total amount of memory used by the tree
+ /bin/sh
+--+ command-to-be-benchmarked
but rather the memory usage of EITHER the shell or the command-to-be-benchmarked?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder now if libc::wait4 wouldn't give us better information.
src/hyperfine/timer/unix_timer.rs
Outdated
@@ -63,6 +64,7 @@ fn cpu_time_interval(start: &CPUTimes, end: &CPUTimes) -> CPUInterval { | |||
CPUInterval { | |||
user: ((end.user_usec - start.user_usec) as f64) * 1e-6, | |||
system: ((end.system_usec - start.system_usec) as f64) * 1e-6, | |||
max_rss: ((end.max_rss_byte - start.max_rss_byte) as f64) * 1e-6, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we used kibibytes and mebibytes here, i.e. divide by 1024 or 1024². All references of MegaBytes
should then be changed to KibiBytes
or MebiBytes
. In the output we would ideally use "KiB" or "MiB".
This is to avoid the confusion with "KB" and "MB" where it's unclear if they refer to the SI standard (as they should, in my opinion), or if it's an old Unix program which uses "KB" and "MB" when they actually refer to "1024 byte" or "1024² byte".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds correct.
src/hyperfine/benchmark.rs
Outdated
|
||
if options.output_style != OutputStyleOption::Disabled { | ||
println!( | ||
" Time ({} ± {}): {:>8} ± {:>8} [User: {}, System: {}]", | ||
" Time ({} ± {}): {:>8} ± {:>8} [User: {}, System: {}, MaxRSS: {}]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty sure that most users would not know what "MaxRSS" refers to. Maybe just "Mem:" and then explain the precise meaning in the --help
text and man page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good idea.
src/hyperfine/benchmark.rs
Outdated
@@ -358,16 +371,18 @@ pub fn run_benchmark( | |||
|
|||
let (user_str, user_unit) = format_duration_unit(user_mean, None); | |||
let system_str = format_duration(system_mean, Some(user_unit)); | |||
let max_rss_str = format!("{:.3} MB", max_rss_mean); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer if we would show either KiB or MiB (depending on the actual usage), with fewer digits in the fractional part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, probably a format_memory
function would make that better :)
Closing this for now. I plan to restructure the hyperfine code a little bit and will keep this use-case in mind. |
Great, let me know if I can help you :) I prepared https://crates.io/crates/wait4 a while ago to improve a bit the backend, but then I had to focus on other tasks. |
Thank you. This PR is not forgotten. But probably won't be merged in this form.
Nice! I still don't fully understand whether it would be possible to get the memory usage of the process in question, given that we launch it through an intermediate shell (related: #336). |
IF we have the intermediate shell we'd have to use getrusage(RUSAGE_CHILDREN) and then subtract what wait4() is telling for the shell PID, I guess. Worst case we'd have to know by some mean the PID of the program we care about and pass it to the wait4 or equivalent. |
I played a bit and I discovered that wait4(child_pid) is behaving exactly like |
@lu-zero Have you tested that on Windows ? I'm testing your branch and the result is wrong
For a process which takes 2 Gb |
Fixes #86