
Regularly reset iOS/tvOS Simulators #9274

Closed
premun opened this issue May 6, 2022 · 9 comments

@premun (Member) commented May 6, 2022

Context

The Simulators slow down as they are used: the folder with their data grows substantially. I did an analysis and found sizes of up to 12 GB, most frequently ranging from 2 to 7 GB, which is a potential cause of the slowdown.
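
For reference, per-simulator data sizes can be checked directly on a machine; a minimal sketch (the CoreSimulator path is the same one used in the reset script below):

# List the data folder size of every simulator on the machine, smallest first
du -sh ~/Library/Developer/CoreSimulator/Devices/* | sort -h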

To be done

XHarness has a command that wipes the Simulator data and resets the Simulator. We should do this repeatedly on our OSX queues to keep them healthy.

The weekly OSOB pipeline might be a good fit for this.

Notes

Simulators can be reset via the following Helix job:

sudo rm -rf /Users/helix-runner/.dotnet  # some machines have a global .NET installed and there are permission problems
curl -sL aka.ms/get-xharness | bash -    # install XHarness locally
export DOTNET_ROOT=./.dotnet             # point at the locally installed .NET

# Reset iOS: find the simulator's UDID, check its data size, reset it, and check the size again
sid=$(./xharness apple device ios-simulator-64)
du -sh ~/Library/Developer/CoreSimulator/Devices/$sid
./xharness apple simulators reset-simulator -o . -t ios-simulator-64
du -sh ~/Library/Developer/CoreSimulator/Devices/$sid

# Reset tvOS: same steps for the tvOS simulator
sid=$(./xharness apple device tvos-simulator)
du -sh ~/Library/Developer/CoreSimulator/Devices/$sid
./xharness apple simulators reset-simulator -o . -t tvos-simulator
du -sh ~/Library/Developer/CoreSimulator/Devices/$sid
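
If this ends up running as a scheduled job, the two per-target blocks above could be folded into a loop. A minimal sketch of that idea, reusing the same xharness commands and the two targets from the script above:

# Reset each simulator target in turn, logging the data folder size before and after
for target in ios-simulator-64 tvos-simulator; do
    sid=$(./xharness apple device $target)
    du -sh ~/Library/Developer/CoreSimulator/Devices/$sid
    ./xharness apple simulators reset-simulator -o . -t $target
    du -sh ~/Library/Developer/CoreSimulator/Devices/$sid
done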

Build     Kind      Start Time
1863024   Rolling   2022-06-07
1865333   Rolling   2022-07-07
1867636   Rolling   2022-08-07
premun added the Needs Triage (a new issue that needs to be associated with an epic) label on May 6, 2022
premun removed the Needs Triage label on May 6, 2022
@karelz (Member) commented Jun 24, 2022

Impact:

  • 1x 1845742 (6/25 PM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 1x 1845315 (6/25 AM) - "Test run timed out after 45 minute(s)" failure
  • 1x 1844615 (6/24 PM) - "Test run timed out after 45 minute(s)" failure
  • 1x 1843409 (6/24 AM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 2x 1842392 (6/23 PM) - "Test run timed out after 30 minute(s)" failure
  • 1x 1841039 (6/23 AM) - "Test run timed out after 30 minute(s)" failure
  • 1x 1841039 (6/23 AM) - "Test run timed out after 45 minute(s)" failure
  • 1x 1840029 (6/22 PM) - "Test run timed out after 45 minute(s)" failure
  • 1x 1837529 (6/21 PM) - "Test run timed out after 45 minute(s)" failure
  • 1x 1835035 (6/20 PM) - "Test run timed out after 45 minute(s)" failure
  • 2x 1832621 (6/18 PM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 1x 1830538 (6/17 AM) - "Test run timed out after 30 minute(s)" failure
  • 2x 1829491 (6/16 PM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" + "fail: Failed to find/create suitable simulator" failure
  • 4x 1828326 (6/16 AM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 1x 1828326 (6/16 AM) - "Test run timed out after 30 minute(s)" failure
  • 1x 1827345 (6/15 PM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 1x 1823354 (6/15 AM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure
  • 2x 1823354 (6/14 AM) - "Test run timed out after 30 minute(s)" failure
  • 1x 1823354 (6/14 AM) - "fail: Cancelling the run after 180 seconds as application failed to launch in time" failure

@greenEkatherine (Contributor) commented:

Every build of the "Build tvOS arm64 Release AllSubsets_Mono" job hits an AzDO error: "The job running on agent Azure Pipelines xx ran longer than the maximum time of 180 minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134".

I see that this fix hasn't been released yet; that should happen today. cc @jtschuster

Builds:
1867636 7/8 AM
1866699 7/7 PM
1865333 7/7 AM
1864252 7/6 PM
1863024 7/6 AM
1861843 7/5 PM
1860642 7/5 AM
1860343 7/4 PM
1859672 7/4 AM
1859371 7/3 PM
1859092 7/3 AM
1858774 7/2 PM
1858309 7/2 AM
1857518 7/1 PM
1856450 7/1 AM
1855530 6/30 PM

@premun (Member, Author) commented Jul 19, 2022

@greenEkatherine this is just too many jobs hitting the queue. Nothing to be done from our side.

Maybe too many runs of extra-platforms?

@premun (Member, Author) commented Jul 19, 2022

This PR moves runtime onto 12.00 with newer simulators that have never been reset. Let's see if this alleviates the issue.

premun self-assigned this Jul 19, 2022
@carlossanlop (Member) commented:

Saw this problem today in a tvOS job for runtime-extra-platforms: https://dev.azure.com/dnceng-public/public/_build/results?buildId=38971&view=logs&j=f4520fb1-1559-5885-1d9c-3cb3f6a85e23&t=6c7a8cfe-f92e-569a-eef9-b2ad3e13056d

[14:58:16] dbug: [TCP tunnel] Xamarin.Hosting: Failed to connect to port 60652 (60652) on device (error: 61)
[14:58:16] dbug: [TCP tunnel] Xamarin.Hosting: Attempting USB tunnel between the port 60652 on the device and the port 60652 (60652) on the mac: 61

The changes in the PR are unrelated: dotnet/runtime#76565

@premun (Member, Author) commented Oct 5, 2022

@carlossanlop I don't think this is related: this issue is about simulators, but your job ran on a real device.

@carlossanlop (Member) commented Oct 5, 2022

@premun Thanks for the confirmation. Should I open a new issue in this repo, or do you know if there's an existing one for real devices? cc @steveisok

@premun (Member, Author) commented Oct 6, 2022

@steveisok is there a tracking issue on your side for the TCP issues?

@premun (Member, Author) commented Feb 8, 2023

We are not going to do the original reset this issue asked for, but the unrelated problem discussed above has been resolved by #11700.

premun closed this as completed Feb 8, 2023