Add concurrency option #69
Conversation
Can you please change the PR title to "Add concurrency option"?
I would suggest adding a smoke test with `options.concurrency` explicitly set, just to make sure future changes don't break the p-all behaviour.
Everything else LGTM.
index.js (Outdated)

@@ -31,6 +33,13 @@ const preprocessDestinationPath = (source, destination, options) => {

module.exports = (source, destination, options = {}) => {
	const progressEmitter = new EventEmitter();
	let concurrencyDegree;
Why did you create a new var to make `options.concurrency` default to `(os.cpus().length || 1) * 2`?
I added the new variable to make it explicit, when reading the code, what the value means. Do you think it's unnecessary? I think having it like this also makes the default value easier to find. I could avoid adding a variable, or abstract the default value into its own variable and pass it with `{concurrency: options.concurrency || defaultConcurrency}` instead; my only concern was the readability of the code.
I think what @hugomrdias is saying (and hinting at) is that you can simplify the code. For example, you may leverage object spread to mix default options in with the ones the user supplies.
Hmm, I'm not sure what you mean by object spread, since this code change operates on only one of the options. The other options are handled in other functions.
I could default it like:
module.exports = (source, destination, options = {concurrency: (os.cpus().length || 1) * 2}) => [...]
...which also makes the default value easy to find. I'm not sure if this would work. Do you think that would be better?
You are right in the sense that each option is currently handled in its own function, so perhaps let's keep that consistency.
Still, we may simplify with:
const concurrency = options.concurrency || (os.cpus().length || 1) * 2;
Also, by renaming the variable from `concurrencyDegree` to `concurrency`, you may use it directly when passing it to `pAll`.
Other than that, the PR looks good. However, I suggest adding a simple test (smoke test) to see if the concurrency is being respected.
About the test, I'm unsure how to proceed. I could make the function return the concurrency value and test that, but I feel like changing the return is beyond the scope of this PR. Otherwise I'm not sure how concurrency could be tested. Maybe by timing the operation, as in p-map/test.js, but that seems highly variable.
Yep, it's hard to test in this case... if we were using
…est for bad concurrency values
I don't think it needs testing. We should trust that
You can stub with AVA too. You just have to bring your favorite stubbing library.
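A stub doesn't even need a library; a hand-rolled one works too. This is a hypothetical sketch, with `fakePAll` standing in for the dependency you would inject in place of p-all:

```javascript
// Hand-rolled stub: record the concurrency each call receives, then behave
// like Promise.all so the caller's control flow is unchanged.
const recorded = [];
const fakePAll = (tasks, options) => {
	recorded.push(options.concurrency);
	return Promise.all(tasks.map(task => task()));
};

fakePAll([() => Promise.resolve(1)], {concurrency: 8});
console.log(recorded[0]); // 8
```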
Looks good :)
Regarding the default value for concurrency: it is, somewhat unexpectedly, based on the completely unrelated number of CPUs in the system. That doesn't make much sense, and (though I didn't test it) I'd expect it to slow things down considerably on modern systems (on the order of a hundred times for small files on an SSD on a system with 2-4 cores).

An ideal default would be based on the kernel-defined limit on open descriptors and the "pipeline width" (which doesn't actually exist as a single number, as it depends on the type of operations performed) of all the devices our file systems sit on. The point of limiting concurrency is to reduce fd pressure on the system and also to reduce memory usage, but the typical values we'd get with this default (a 2-128 range, with 2 for micro VMs and 128 for huge servers) are far below the values which would apply serious pressure (compared to the pressure introduced by Node.js itself), and usually also far below the values which would saturate IO (probably in the range of 64-1024 for similar systems).

So, I'd advise just setting it to some relatively high constant (my earlier guess was 1024), which will saturate IO for almost all systems, yet won't be too taxing on fds/memory. Just to amplify all that: basing IO concurrency on the number of CPUs is complete bullshit.
Fixes #34.
Switches Promise.all for p-all, and adds a concurrency option with the suggested default value.