Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fallback when server launch fail due to create mutex error #8000

Merged

Conversation

rokonec
Copy link
Contributor

@rokonec rokonec commented Sep 22, 2022

Fixes #7993, dotnet/runtime#75867

Context

There were reported issues, when mutex logic failed with unhandled IOException.

Changes Made

Extend TryLaunchServer try-catch scope to Include opening mutex and if it throws fallback to non-server behavior.

Testing

Unit test, CI.

@rokonec rokonec requested a review from AR-May September 22, 2022 11:57
@rokonec rokonec added the merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now. label Sep 23, 2022
@marcpopMSFT
Copy link
Member

@rokonec we want this in asap I think and flowing to an rtm sdk build that other teams can try.

@danmoseley
Copy link
Member

Looks like you're closing the runtime repo bug. Do we still need a bug to figure out why it's timing out? Someone else mentioned it wasn't documented.

@rokonec
Copy link
Contributor Author

rokonec commented Sep 24, 2022

I feel confident about those changes so I will go ahead and merge it now.

@rokonec rokonec merged commit c7d7585 into dotnet:main Sep 24, 2022
@rokonec
Copy link
Contributor Author

rokonec commented Sep 24, 2022

@danmoseley

Do we still need a bug to figure out why it's timing out?

I would still like to know root cause. Yes it is not documented why would creating named mutex throw IOException with "Timeout" message. I think it might deserve investigation to better understand severity of this.

@danmoseley
Copy link
Member

It would be great if you could open a runtime repo bug describing the configurations where you've seen this. Even if it's not actionable right now.

@AR-May
Copy link
Member

AR-May commented Oct 5, 2022

It would be great if you could open a runtime repo bug describing the configurations where you've seen this. Even if it's not actionable right now.

Yes, @rokonec is on it.

Forgind pushed a commit that referenced this pull request Oct 10, 2022
…or. (#8024)

Fixes #7993

Summary
In the MSBuild server, we identify the state of the server via mutexes. Sometimes, for reason yet unknown to us, mutex could throw the exception System.IO.IOException: Connection timed out. When this occurs, we fallback to old behavior building without server. We fixed some of those in #8000, but now found more situations when this happens.

Customer Impact
MSBuild non-Windows users could have intermittent error when building with dotnet build.
This does not affect Visual Studio.

Regression?
Yes, this is a regression.

Testing
Unit tests.

Risk
Low risk. The fix adds additional try-catch blocks to process this situation.

Code Reviewers
[TODO]

Description of the fix
Add a try-catch block to catch and process the IOException exception when mutexes are used.
Add a new client exit type for this kind of error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge-when-branch-open PRs that are approved, except that there is a problem that means we are not merging stuff right now.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MSBuild Server fallback mechanism doesn't work when Mutex throws exception
6 participants