Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary cache: async push_success #908

Draft
wants to merge 55 commits into
base: main
Choose a base branch
from

Conversation

autoantwort
Copy link
Contributor

@autoantwort autoantwort commented Feb 15, 2023

This results in ~10-20% faster build times on my machine.

For example building boost on my M1 mac went down from 2.948 min to 2.375 min

@autoantwort autoantwort marked this pull request as draft February 15, 2023 20:04
@autoantwort
Copy link
Contributor Author

How or when should "upload messages" (like Uploaded binaries to {count} {vendor}.) be printed?

@Thomas1664
Copy link
Contributor

Doesn't this have the same problem as #694 that the working thread might exit due to calls to check_exit or value_or_exit?

@autoantwort
Copy link
Contributor Author

Doesn't this have the same problem as #694 that the working thread might exit due to calls to check_exit or value_or_exit?

Kind of. In general we need an option to decide if a binary cache failure should be a hard error or only a warning

@Thomas1664
Copy link
Contributor

Kind of. In general we need an option to decide if a binary cache failure should be a hard error or only a warning

The problem is that we almost never can be sure that there isn't some nested API call that exits on failure. But it seems like #909 at least partially addresses this issue.

@autoantwort
Copy link
Contributor Author

Yeah but in the binary cache are nearly no hard exists. It currently also only prints warnings.

Copy link
Contributor

@ras0219-msft ras0219-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this direction; unblocking I/O work has great potential for making vcpkg much faster.

However we need to be very careful about the impacts of concurrency -- deadlocks suck :(

include/vcpkg/binarycaching.h Outdated Show resolved Hide resolved
src/vcpkg/install.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
include/vcpkg/binarycaching.h Outdated Show resolved Hide resolved
src/vcpkg.cpp Outdated
@@ -156,6 +157,7 @@ namespace vcpkg::Checks
// Implements link seam from basic_checks.h
void on_final_cleanup_and_exit()
{
BinaryCache::wait_for_async_complete();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we can do this here. This is on the critical path for ctrl-c handling and should only be used for extremely fast, emergency tear-down behavior (like restoring the console).

If there happens to be an exit anywhere in any BinaryCache implementation, this would deadlock. Importantly, this include any sort of assertion we might want to do, like checking pointers for null.

Unfortunately, the only path forward I see is to call this (or appropriately scope the BinaryCache itself) at the relevant callers. The consequence of possibly not uploading some set of binary caches in the case of some unhandled program error (such as permissions issue on a directory expected to be writable) is vastly preferable to deadlocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have changed the BinaryCache::wait_for_async_complete() implementation so it does not deadlock anymore.

I also moved the call to Checks::exit_with_code which is not called when crtl+c is handled. (I personally would like to have a way to terminate vcpkg but wait until the binary cache is done so that I don't lose progress.)

And I prefer it when build packages are uploaded to the binary caches before vcpkg exits because of an error, otherwise I have to build the already build packages again at a later point when there is no cache entry.

Copy link
Contributor

@ras0219-msft ras0219-msft Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that it is, desirable to finish uploading on "understood" errors. For example, if a package failed to build or failed to be installed.

I was also wrong about my original assessment of a deadlock. My concern was the call path of the binary upload thread calling Checks::unreachable() or .value_or_exit(), but it seems that std::thread::join() does have a carve-out to handle this specific case: it will throw a resource_deadlock_would_occur if you try to join yourself.

I've put some other concerns below, but I don't want those to distract from my main point: We must make it as trivial / correct-by-construction as possible to guarantee that the binary cache thread NEVER attempts to wait on itself. I think the best approach for vcpkg right now is to add calls from Install::perform() etc to BinaryCache::wait_for_async_complete() before any "user-facing" error, such as the exit guarded by result.code != BuildResult::SUCCEEDED && keep_going == KeepGoing::NO. This is motivated by the perspective that it's always safer to terminate than to join and possibly deadlock / race condition / etc.


There's still a UB data race if the main thread and binary upload thread attempt to exit at the same time:

Concurrently calling join() on the same thread object from multiple threads constitutes a data race that results in undefined behavior.
-- https://en.cppreference.com/w/cpp/thread/thread/join

There's also a serious "scalability" problem if we ever want a second background thread for whatever reason, because BGThread A would join on BGThread B, while BGThread B tries to join on BGThread A. This might be solvable with ever more complex structures, such as a thread ownership DAG that gets threads to join only on their direct children, but I don't think the benefit is worth the cost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UB and the joining itself could simply be prevented by doing a if (std::this_thread::get_id() == instance->push_thread.get_id()). My concern with the explicit approach is that it is easy to forget to call the waiting function of the BinaryCache and every time you want to exit you have to remember to call it. This seems to me to be very prone to human error.

Copy link
Contributor Author

@autoantwort autoantwort Mar 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now implemented your request

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ras0219-msft Is there anything left that is preventing this PR from being merged?

src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
@autoantwort autoantwort marked this pull request as ready for review March 5, 2023 20:28
# Conflicts:
#	src/vcpkg/binarycaching.cpp
@BillyONeal BillyONeal removed the depends:different-pr This PR depends on a different PR which has been filed label Jun 22, 2023
include/vcpkg/base/batch-quere.h Outdated Show resolved Hide resolved
include/vcpkg/base/batch-quere.h Outdated Show resolved Hide resolved
include/vcpkg/base/batch-quere.h Outdated Show resolved Hide resolved
# Conflicts:
#	src/vcpkg/commands.ci.cpp
#	src/vcpkg/commands.install.cpp
#	src/vcpkg/commands.set-installed.cpp
#	src/vcpkg/commands.upgrade.cpp
@BillyONeal BillyONeal self-assigned this Oct 4, 2023
Copy link
Contributor

@Thomas1664 Thomas1664 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nitpicks.
I think we should use size_t instead of int when we're using a variable effectively as an index inside a loop.
Furthermore, I noticed that some variable names are just letters. For me, this is confusing because I don't know what the variable is for a few lines later.


BGMessageSink m_bg_msg_sink;
BGThreadBatchQueue<ActionToPush> m_actions_to_push;
std::atomic_int m_remaining_packages_to_push = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::atomic_int m_remaining_packages_to_push = 0;
std::atomic_size_t m_remaining_packages_to_push = 0;

I think this should be size_t because we effectively could have size_t packages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member existing at all is a smell to me. The number of items in the queue should be a part of the queue, not something tracked externally. Moreover, the queue already contains locks and stuff so I'm not sure why we need an atomic here.

src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Outdated Show resolved Hide resolved
src/vcpkg/binarycaching.cpp Show resolved Hide resolved
src/vcpkg/base/message_sinks.cpp Outdated Show resolved Hide resolved
Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I'm going to tackle the structural error reporting etc. thing now that I think this change is really exposing / crying out for. In particular, I think we should consider a design which more formally associates chunks of console output with the package in question rather than a free for all on the message sink type requiring a lot of 'reverse engineer from stuff the caller already knew' kind of behavior.
  2. I think the discussion of how we are achieving thread safety across several large components like this needs the comment like I described in a comment.
  3. The comment about 'BinaryCache is supposed to hide unique_ptr' should get fixed, which will also eliminate a lot of the changes in this PR attempting to adapt to that change.

When I have a solution for (1) are you interested in me just pushing changes into this PR or do you want a PR for your PR?

(By and large I think this change is structurally good)


BinaryCache(const Filesystem& fs);
BinaryCache(const BinaryCache&) = delete;
BinaryCache(BinaryCache&&) = default;
BinaryCache(BinaryCache&&) = delete;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with this design change. BinaryCache itself already contains several unique_ptr-alikes, and intends to firewall that storage decision from its customers. If there are immovable bits the unique_ptr juggling should happen inside rather than outside.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment about 'BinaryCache is supposed to hide unique_ptr' should get fixed, which will also eliminate a lot of the changes in this PR attempting to adapt to that change.

How would you implement that? The function push_thread_main uses the members of the class, which gets moved away if you move the BinaryCache. So I guess using pimpl is the only solution here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have now put all data in an extra struct that is hold via a std::unique_ptr by the BinaryCache class.

@@ -196,23 +200,41 @@ namespace vcpkg

struct BinaryCache : ReadOnlyBinaryCache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When adding threading functionality to the codebase that is not effectively 'do a single function but faster', I think there needs to be a discussion in a comment of 'this is how the threads and communication between them work'.

For instance:

// compression and upload of binary cache entries happens on a single 'background' thread, `m_push_thread`
// Thread safety is achieved within the binary cache providers by:
//   1. Only using one thread in the background for this work.
//   2. Forming a queue of work for that thread to consume in `m_actions_to_push`, which maintains its own thread safety
//   3. Sending any replies from the background thread through `m_bg_msg_sink`
//   4. Ensuring any supporting data, such as tool exes, is provided before the background thread is started.
//   5. Ensuring that work is not submitted to the background thread until the corresponding `packages` directory to upload is no longer being actively touched by the foreground thread.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does your approach here survive #1076 ? Hard links mean that we have a lot less certainty on the packages directory being 'hermetic'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#1076 does not have an impact here. The files "copied" to the installed dir are not later changed by another package. Even if they are overwritten, the hard link in the packages folder would still link to the same original file.

Copy link
Contributor Author

@autoantwort autoantwort Nov 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has an impact on windows if #802 gets merged before this. Because when the packages folder gets compressed by 7z, the next feature gets tested and stuff gets removed from the installed dir and on windows you cant remove a hard link even if the linked file is only opened via another hard link -.-

PS: Not sure, but I noticed this behavior only on the windows dev drive, but maybe the normal filesystem was simply too slow so that this never happened

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was finally able to catch this situation that happens in combination of this PR with #802:
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly the catch seems to only happen with this specific rocksdb.lib file and error nearly only happens with lib files.

And in general it only happens on the dev drive and not on a normal NTFS drive. Strage 😕

Comment on lines +112 to +114
// buffers messages until newline is reached
// guarded by m_print_directly_lock
std::vector<std::pair<Color, std::string>> m_unpublished;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like that? Maybe? I think what this really argues is that we really need a semi-standardized 'document' error type @ras0219-msft has been asking for ages... we're almost done getting rid of ParseControlErrorInfo which means we will finally have a unified way of handling errors and can finally look at messing with that.


BGMessageSink m_bg_msg_sink;
BGThreadBatchQueue<ActionToPush> m_actions_to_push;
std::atomic_int m_remaining_packages_to_push = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This member existing at all is a smell to me. The number of items in the queue should be a part of the queue, not something tracked externally. Moreover, the queue already contains locks and stuff so I'm not sure why we need an atomic here.


bool empty() const { return forward.empty(); }

void pop(std::vector<T>& out)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this thing being called queue given that this is how it works. Given that we expect this to be a multi producer single consumer queue, can we instead put the vector inside and note that only one thread may call pop but any number of threads may call push? That would also resolve the criticism over separate tracking atomics below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchQueue alone is not thread safe.

Given that we expect this to be a multi producer single consumer queue, can we instead put the vector inside and note that only one thread may call pop but any number of threads may call push?

I don't get what you have in mind here 😅

I don't like this thing being called queue given that this is how it works.

Do you have an idea for a better name? :)

Comment on lines +119 to +120
std::lock_guard<std::mutex> print_lk(m_print_directly_lock);
std::lock_guard<std::mutex> lk(m_published_lock);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
std::lock_guard<std::mutex> print_lk(m_print_directly_lock);
std::lock_guard<std::mutex> lk(m_published_lock);
std::lock_guard<std::mutex, std::mutex> lk(m_print_directly_lock, m_published_lock);

if this survives

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you actually mean std::scoped_lock.

@autoantwort
Copy link
Contributor Author

When I have a solution for (1) are you interested in me just pushing changes into this PR or do you want a PR for your PR?

You can just push to this PR. I will look through the other comments in the following days :)

# Conflicts:
#	include/vcpkg/base/message-data.inc.h
#	locales/messages.json
@autoantwort autoantwort marked this pull request as draft November 12, 2023 02:19
BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Dec 21, 2023
…ome clear that we need a thread safe channel through which errors and warnings can both be reasonably reported. Now that microsoft#1279 is landed and functionally everything in the codebase uses ExpectedL, we can look at what the new thing that fixes issues is.

Consider the following:

```c++
    ExpectedL<T> example_api(int a);

    ExpectedL<std::unique_ptr<SourceControlFile>> try_load_port_manifest_text(StringView text,
                                                                              StringView control_path,
                                                                              MessageSink& warning_sink);
```

The reason this can't return the warnings through the ExpectedL channel is that we don't want the 'error' state to be engaged when there are merely warnings. Moreover, that these channels are different channels means that situations that might want to return errors and warnings together, as happens when parsing files, means that order relationships between errors and warnings is lost. It is probably a good idea in general to put warnings and errors about the same location next to each other in the output, but that's hard to do with this interface.

Rather than multiplexing everything through the return value, this proposal is to multiplex only the success or failure through the return value, and report any specific error information through an out parameter.

1. Distinguish whether an overall operation succeeded or failed in the return value, but
2. record any errors or warnings via an out parameter.

Applying this to the above gives:

```c++
    Optional<T> example_api(MessageContext& context, int a);

    // unique_ptr is already 'optional'
    std::unique_ptr<SourceControlFile> try_load_port_manifest_text(MessageContext& context,
                                                                   StringView text,
                                                                   StringView control_path);
```

Issues this new mechanism fixes:

* Errors and warnings can share the same channel and thus be printed together
* The interface between code wanting to report events and the code wanting to consume them is a natural thread synchronization boundary. Other attempts to fix this have been incorrect by synchronizing individual print calls ( microsoft#1290 ) or complex enough that we are not sure they are correct by trying to recover boundaries by reparsing our own error output ( microsoft#908 )
* This shuts down the "error: error:" and similar bugs where it isn't clear who is formatting the overall error message vs. talking about individual components

Known issues that are not fixed by this change:

* This still doesn't make it easy for callers to programmatically handle specific types of errors. Currently, we have some APIs that still use explicit `std::error_code` because they want to do different things for 'file does not exist' vs. 'there was an I/O error'. Given that this condition isn't well served by the ExpectedL mechanism I don't want to wait until we have a better solution to it to proceed.
* Because we aren't making the context parameter the 'success carrier' it's more complex to implement 'warnings as errors' or similar functionality where the caller decides how 'important' something is. I would be in favor of moving all success tests to the context parameter but I'm not proposing that because the other vcpkg maintainers do not like it.
* Contextual information / stack problems aren't solved. However, the context parameter might be extended in the future to help with this.
BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Dec 22, 2023
…ome clear that we need a thread safe channel through which errors and warnings can both be reasonably reported. Now that microsoft#1279 is landed and functionally everything in the codebase uses ExpectedL, we can look at what the new thing that fixes issues is.

Consider the following:

```c++
    ExpectedL<T> example_api(int a);

    ExpectedL<std::unique_ptr<SourceControlFile>> try_load_port_manifest_text(StringView text,
                                                                              StringView control_path,
                                                                              MessageSink& warning_sink);
```

The reason this can't return the warnings through the ExpectedL channel is that we don't want the 'error' state to be engaged when there are merely warnings. Moreover, that these channels are different channels means that situations that might want to return errors and warnings together, as happens when parsing files, means that order relationships between errors and warnings is lost. It is probably a good idea in general to put warnings and errors about the same location next to each other in the output, but that's hard to do with this interface.

Rather than multiplexing everything through the return value, this proposal is to multiplex only the success or failure through the return value, and report any specific error information through an out parameter.

1. Distinguish whether an overall operation succeeded or failed in the return value, but
2. record any errors or warnings via an out parameter.

Applying this to the above gives:

```c++
    Optional<T> example_api(MessageContext& context, int a);

    // unique_ptr is already 'optional'
    std::unique_ptr<SourceControlFile> try_load_port_manifest_text(MessageContext& context,
                                                                   StringView text,
                                                                   StringView control_path);
```

Issues this new mechanism fixes:

* Errors and warnings can share the same channel and thus be printed together
* The interface between code wanting to report events and the code wanting to consume them is a natural thread synchronization boundary. Other attempts to fix this have been incorrect by synchronizing individual print calls ( microsoft#1290 ) or complex enough that we are not sure they are correct by trying to recover boundaries by reparsing our own error output ( microsoft#908 )
* This shuts down the "error: error:" and similar bugs where it isn't clear who is formatting the overall error message vs. talking about individual components

Known issues that are not fixed by this change:

* This still doesn't make it easy for callers to programmatically handle specific types of errors. Currently, we have some APIs that still use explicit `std::error_code` because they want to do different things for 'file does not exist' vs. 'there was an I/O error'. Given that this condition isn't well served by the ExpectedL mechanism I don't want to wait until we have a better solution to it to proceed.
* Because we aren't making the context parameter the 'success carrier' it's more complex to implement 'warnings as errors' or similar functionality where the caller decides how 'important' something is. I would be in favor of moving all success tests to the context parameter but I'm not proposing that because the other vcpkg maintainers do not like it.
* Contextual information / stack problems aren't solved. However, the context parameter might be extended in the future to help with this.
BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Dec 29, 2023
…ome clear that we need a thread safe channel through which errors and warnings can both be reasonably reported. Now that microsoft#1279 is landed and functionally everything in the codebase uses ExpectedL, we can look at what the new thing that fixes issues is.

Consider the following:

```c++
    ExpectedL<T> example_api(int a);

    ExpectedL<std::unique_ptr<SourceControlFile>> try_load_port_manifest_text(StringView text,
                                                                              StringView control_path,
                                                                              MessageSink& warning_sink);
```

The reason this can't return the warnings through the ExpectedL channel is that we don't want the 'error' state to be engaged when there are merely warnings. Moreover, that these channels are different channels means that situations that might want to return errors and warnings together, as happens when parsing files, means that order relationships between errors and warnings is lost. It is probably a good idea in general to put warnings and errors about the same location next to each other in the output, but that's hard to do with this interface.

Rather than multiplexing everything through the return value, this proposal is to multiplex only the success or failure through the return value, and report any specific error information through an out parameter.

1. Distinguish whether an overall operation succeeded or failed in the return value, but
2. record any errors or warnings via an out parameter.

Applying this to the above gives:

```c++
    Optional<T> example_api(MessageContext& context, int a);

    // unique_ptr is already 'optional'
    std::unique_ptr<SourceControlFile> try_load_port_manifest_text(MessageContext& context,
                                                                   StringView text,
                                                                   StringView control_path);
```

Issues this new mechanism fixes:

* Errors and warnings can share the same channel and thus be printed together
* The interface between code wanting to report events and the code wanting to consume them is a natural thread synchronization boundary. Other attempts to fix this have been incorrect by synchronizing individual print calls ( microsoft#1290 ) or complex enough that we are not sure they are correct by trying to recover boundaries by reparsing our own error output ( microsoft#908 )
* This shuts down the "error: error:" and similar bugs where it isn't clear who is formatting the overall error message vs. talking about individual components

Known issues that are not fixed by this change:

* This still doesn't make it easy for callers to programmatically handle specific types of errors. Currently, we have some APIs that still use explicit `std::error_code` because they want to do different things for 'file does not exist' vs. 'there was an I/O error'. Given that this condition isn't well served by the ExpectedL mechanism I don't want to wait until we have a better solution to it to proceed.
* Because we aren't making the context parameter the 'success carrier' it's more complex to implement 'warnings as errors' or similar functionality where the caller decides how 'important' something is. I would be in favor of moving all success tests to the context parameter but I'm not proposing that because the other vcpkg maintainers do not like it.
* Contextual information / stack problems aren't solved. However, the context parameter might be extended in the future to help with this.
BillyONeal added a commit to BillyONeal/vcpkg-tool that referenced this pull request Jan 26, 2024
…ome clear that we need a thread safe channel through which errors and warnings can both be reasonably reported. Now that microsoft#1279 is landed and functionally everything in the codebase uses ExpectedL, we can look at what the new thing that fixes issues is.

Consider the following:

```c++
    ExpectedL<T> example_api(int a);

    ExpectedL<std::unique_ptr<SourceControlFile>> try_load_port_manifest_text(StringView text,
                                                                              StringView control_path,
                                                                              MessageSink& warning_sink);
```

The reason this can't return the warnings through the ExpectedL channel is that we don't want the 'error' state to be engaged when there are merely warnings. Moreover, that these channels are different channels means that situations that might want to return errors and warnings together, as happens when parsing files, means that order relationships between errors and warnings is lost. It is probably a good idea in general to put warnings and errors about the same location next to each other in the output, but that's hard to do with this interface.

Rather than multiplexing everything through the return value, this proposal is to multiplex only the success or failure through the return value, and report any specific error information through an out parameter.

1. Distinguish whether an overall operation succeeded or failed in the return value, but
2. record any errors or warnings via an out parameter.

Applying this to the above gives:

```c++
    Optional<T> example_api(MessageContext& context, int a);

    // unique_ptr is already 'optional'
    std::unique_ptr<SourceControlFile> try_load_port_manifest_text(MessageContext& context,
                                                                   StringView text,
                                                                   StringView control_path);
```

Issues this new mechanism fixes:

* Errors and warnings can share the same channel and thus be printed together
* The interface between code wanting to report events and the code wanting to consume them is a natural thread synchronization boundary. Other attempts to fix this have been incorrect by synchronizing individual print calls ( microsoft#1290 ) or complex enough that we are not sure they are correct by trying to recover boundaries by reparsing our own error output ( microsoft#908 )
* This shuts down the "error: error:" and similar bugs where it isn't clear who is formatting the overall error message vs. talking about individual components

Known issues that are not fixed by this change:

* This still doesn't make it easy for callers to programmatically handle specific types of errors. Currently, we have some APIs that still use explicit `std::error_code` because they want to do different things for 'file does not exist' vs. 'there was an I/O error'. Given that this condition isn't well served by the ExpectedL mechanism I don't want to wait until we have a better solution to it to proceed.
* Because we aren't making the context parameter the 'success carrier' it's more complex to implement 'warnings as errors' or similar functionality where the caller decides how 'important' something is. I would be in favor of moving all success tests to the context parameter but I'm not proposing that because the other vcpkg maintainers do not like it.
* Contextual information / stack problems aren't solved. However, the context parameter might be extended in the future to help with this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants