AzDO Networking issue impacting multiple builds #8593

Closed
missymessa opened this issue Mar 10, 2022 · 15 comments

Comments

@missymessa
Member

missymessa commented Mar 10, 2022

Issue for tracking the intermittent, inconsistent networking errors we're encountering in our builds.

https://portal.microsofticm.com/imp/v3/incidents/details/292951370/home

{
   "errorMessage" : "net/http: request canceled while waiting for connection"
}
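
The JSON above reads like a log signature: a string to look for in a failing build's console output. As a minimal sketch, assuming plain substring matching (this issue does not show how the tooling actually matches, and `matches_known_error` is a hypothetical name), the check could look like this:

```python
import json

# Signature copied from the issue body above.
SIGNATURE = '{"errorMessage": "net/http: request canceled while waiting for connection"}'


def matches_known_error(signature_json: str, log_text: str) -> bool:
    """Return True if the signature's errorMessage occurs anywhere in the log text.

    Assumption: simple substring matching, for illustration only.
    """
    signature = json.loads(signature_json)
    return signature["errorMessage"] in log_text
```

Under that assumption, a log line ending in `dial tcp ...: i/o timeout` would not match this signature, even though it may stem from the same connectivity problem.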

Report

| Build | Definition | Step Name | Console log | Pull Request |
| --- | --- | --- | --- | --- |
| 58474 | dotnet/runtime | Initialize containers | Log | dotnet/runtime#74963 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 0 | 0 | 1 |
@ulisesh added the Critical, FC - Infrastructure, and Known Build Error labels Mar 10, 2022
@ulisesh changed the title from "Tracking issue for AzDO Networking issue IcM" to "AzDO Networking issue impacting multiple builds" Mar 10, 2022
@ilyas1974
Contributor

I have created Azure support ticket 2203110010001681 to try and help address this.

@AraHaan
Member

AraHaan commented Mar 13, 2022

Are there perhaps plans for a public version of the link above, so we have a general idea of what the cause of the issue might be?

@AraHaan
Member

AraHaan commented Mar 14, 2022

Good news: it seems my PR is now unblocked by this: #8604

@markwilkie
Member

Hi @AraHaan - I don't think there's a way to provide public access to our internal issue tracking system. :( In this case, the title kinda says it all. We're having intermittent issues w/ connectivity and the Azure folks are trying to get to the bottom of it.

Great to see your PR seems to have worked!

@adiaaida and/or @mmitche - would one of you mind taking a look at the PR? (looks good to me)

@michellemcdaniel
Contributor

I don't have context on the file being changed, so hopefully Matt can look?

@ilonatommy
Member

ilonatommy commented Mar 16, 2022

I am adding some new cases affected by this error.

  1. Installer Build and Test coreclr Linux_arm64 Release, Pipelines - Run 20220314.2.
    error:
docker run -v /mnt/vss/_work/1/s:/root/runtime -w=/root/runtime -e VSS_NUGET_URI_PREFIXES -e VSS_NUGET_ACCESSTOKEN mcr.microsoft.com/dotnet-buildtools/prereqs:rhel-7-rpmpkg-c982313-20174116044113 ./build.sh --ci --subset packs.installers /p:BuildRpmPackage=true /p:Configuration=Release /p:TargetOS=Linux /p:TargetArchitecture=arm64 /p:RuntimeFlavor=coreclr /p:RuntimeArtifactsPath=/root/runtime/artifacts/transport/coreclr /p:RuntimeConfiguration=release /p:LibrariesConfiguration=Release /bl:artifacts/log/Release/msbuild.rpm.installers.binlog 
Unable to find image 'mcr.microsoft.com/dotnet-buildtools/prereqs:rhel-7-rpmpkg-c982313-20174116044113' locally 
docker: Error response from daemon: Get "https://mcr.microsoft.com/v2/": net/http: request canceled while waiting for connection
  2. Installer Build and Test coreclr Linux_musl_x64 Release, Pipelines - Run 20220314.22, Mono Product Build Linux x64 debug Run 20220316.68
    error:
docker: error pulling image configuration: Get "https://westus2.data.mcr.microsoft.com/01031d61e1024861afee5d512651eb9f-h36fskt2ei//docker/registry/v2/blobs/sha256/d3/d3358c58cff96d0874e70d9ef680e5c69a452079d7d651f9e441c48b62a95144/data?se=2022-03-14T18%3A52%3A56Z&sig=7CM6Q6E1lL%2F07ifd%2FR1VVO%2BRlBbCH%2FiCs8V%2Fki%2BvxXE%3D&sp=r&spr=https&sr=b&sv=2016-05-31&regid=01031d61e1024861afee5d512651eb9f": dial tcp 131.253.33.219:443: i/o timeout. 
  3. Build Android arm Release AllSubsets_Mono, Pipelines - Run 20220314.1, Build Browser wasm Linux Release LibraryTests_EAT, Pipelines - Run 20220315.4, Build Linux x64 Release AllSubsets_Mono_LLVMJIT Run 20220316.68.
    error:
Error response from daemon: Get "https://mcr.microsoft.com/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
##[error]Docker pull failed with exit code 1
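
Since the failures listed above are intermittent, one common mitigation is to retry the pull with a growing delay. Nothing in this thread says this was adopted; the sketch below is only an illustration. `pull_with_retry`, the attempt count, and the delays are hypothetical choices, and the `run` and `sleep` parameters exist only so the logic can be exercised without Docker.

```python
import subprocess
import time
from typing import Callable, Sequence


def pull_with_retry(
    image: str,
    attempts: int = 4,
    base_delay: float = 2.0,
    run: Callable[[Sequence[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `docker pull` up to `attempts` times, doubling the wait after each failure.

    Returns True as soon as one pull exits with code 0, False if all attempts fail.
    """
    for attempt in range(attempts):
        if run(["docker", "pull", image]) == 0:
            return True
        # Wait before the next try, but not after the final failed attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False
```

A retry like this only hides short outages; a sustained registry or CDN problem would still fail every attempt.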

@ilyas1974
Contributor

Per information from @agocke, there is a problem with the CDN behind MCR.

@garath
Member

garath commented Mar 16, 2022

> Per information from @agocke, there is a problem with the CDN behind MCR.

Tracking in 295259702

@ilyas1974
Contributor

After doing some additional research, we could not find any recent instances of this error. Should it occur again, we will open another issue with the MCR team.

@AraHaan
Member

AraHaan commented Mar 31, 2022

@AlitzelMendez
Member

@AlitzelMendez AlitzelMendez reopened this May 4, 2022
@ilyas1974 removed the FC - Infrastructure label May 9, 2022
@ilyas1974 removed their assignment Jun 22, 2022
@markwilkie
Member

Do y'all think this should be a known build error, or marked as critical? My impression is that the hit count is low enough that it doesn't meet the critical bar....but not sure if we have a bar yet.... :) Thoughts @ilyas1974 ?

@ilyas1974
Contributor

We set a bar of 200 jobs impacted before we engage a partner team (metric review back in May).

@ilyas1974
Contributor

As there have not been any instances of this issue for the last 7 days, I am closing this issue.
