-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Try to fix macOS hang when building Python 2 #1343
Conversation
Rather than opening separate file handles, and reading the first part with one and the rest with the other, introduce best_effort_read_contents_if_shebang which implements this test directly when doing the first read buffer. Also fixes read_contents to avoid needing to copy the entire file down by 3 bytes if the file starts with a BOM using the same idea.
ea0ca67
to
d40dc6f
Compare
I kicked off 3 copies of the osx-x64 build to see if this helps: |
Initial results are promising since all 3 of these got past Python2 without issues but without a full understanding of the cause of the problem it is difficult to be sure. |
|
||
std::vector<std::future<void>> workers; | ||
workers.reserve(num_threads - 1); | ||
WorkCallbackContext(F init_f, size_t work_count) : work(init_f), work_count(work_count), next_offset(0) { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WorkCallbackContext(F init_f, size_t work_count) : work(init_f), work_count(work_count), next_offset(0) { } | |
WorkCallbackContext(F init_f, size_t work_count) : work(std::move(init_f)), work_count(work_count), next_offset(0) { } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a functor/callable, which are usually pointers, and for which std::move is a pessimization. (This is also why they are passed by value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why taking it by rvalue reference would be a pessimization, but how would std::move reduce performance? For simple pointers, isn't it the same as a copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::move
forces that the value be bound to an rvalue-reference. This is why it, for example, breaks NRVO in return std::move(blah);
cases. It is likely that it wouldn't matter in this case; I'm just used to 'pass the functors by value' resulting from the mess that caused us to add _Pass_fn
: passing functors by reference broke the ability to inline through function pointers. Maybe my paranoia here is out of date by now but given that we control all the functors here and aren't going to ever pass ones that are expensive to copy I'm inclined to keep following the convention.
auto offset = next_offset.load(std::memory_order_relaxed); | ||
while (offset < work_count) | ||
{ | ||
if (!next_offset.compare_exchange_weak(offset, offset + 1, std::memory_order_relaxed)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this use fetch_add
instead, since we can assume that work_count + threads < SIZE_MAX
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we can make that assumption. (fetch_add
was @Thomas1664 's initial solution and I requested this form to avoid overflow)
I suppose we can assume this is unlikely and add other edge cases for it though I'm not sure that would be better... stand by
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this use
fetch_add
instead, since we can assume thatwork_count + threads < SIZE_MAX
?
work_count
could reach SIZE_MAX
. I remember that some PRs that used smaller types in context of the install plan were rejected.
if (ptp_work) | ||
{ | ||
auto max_threads = (std::min)(work_count, static_cast<size_t>(get_concurrency())); | ||
max_threads = (std::min)(max_threads, SIZE_MAX - work_count); // to avoid overflow in fetch_add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_threads = (std::min)(max_threads, SIZE_MAX - work_count); // to avoid overflow in fetch_add | |
if (SIZE_MAX - work_count > max_threads) Checks::unreachable(VCPKG_LINE_INFO); |
I don't think trying to be fancy here to support more than 18,446,744,073,709,550,000-ish (or 4,294,966,000-ish) items is worth it.
Also, I think the current line is pessimistic by 1 item/thread -- if work_count == SIZE_MAX
, 1 thread is permissible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think trying to be fancy here to support more than 18,446,744,073,709,550,000-ish (or 4,294,966,000-ish) items is worth it.
I think it is worth it, at least as long as 32 bit is not dead. (And arm-linux is still very much a thing, even if we won't get much parallelism there)
Also, I think the current line is pessimistic by 1 item/thread -- if work_count == SIZE_MAX, 1 thread is permissible.
Sure. (Note that work_count == SIZE_MAX
is impossible because the end()
element can't exist in the address space. :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Moreover, a dependency from this stuff on Checks:: makes this more complex, not less)
std::future
.