increase timeouts range in both worker and bundle-manager to help all… #4421

Merged (1 commit) on Mar 5, 2023

@AndrewJGaut AndrewJGaut commented Mar 3, 2023

In a previous PR, I adjusted the wrong timeout in the bundle-manager. This PR corrects that and also increases the timeout for the worker's check-in.


AndrewJGaut commented Mar 3, 2023

Completion time of stress tests:

Run 1
--- Completion Time: 152.82416373093923 minutes---
{"_test_large_bundle_upload": [1654.1023645401], "_test_large_bundle_result": [1369.4457926750183], "_test_many_gpu_runs": [1160.1446483135223], "_test_multiple_cpus_runs_count": [1184.5576422214508], "_test_many_bundle_uploads": [355.61165142059326], "_test_many_worksheet_copies": [728.1842911243439], "_test_parallel_runs": [2200.3067133426666], "_test_many_docker_runs": [505.5193784236908], "_test_infinite_memory": [0.14172840118408203], "_test_infinite_gpu": [0.13679075241088867], "_test_infinite_disk": [0.13950204849243164], "_test_many_disk_writes": [0.9853329658508301]}

Run 2
--- Completion Time: 145.26957790851594 minutes---
{"_test_large_bundle_upload": [1688.2870078086853], "_test_large_bundle_result": [1452.253629207611], "_test_many_gpu_runs": [881.2471234798431], "_test_multiple_cpus_runs_count": [793.2172269821167], "_test_many_bundle_uploads": [365.5718801021576], "_test_many_worksheet_copies": [753.9356782436371], "_test_parallel_runs": [2195.8937842845917], "_test_many_docker_runs": [582.8920478820801], "_test_infinite_memory": [0.14975595474243164], "_test_infinite_gpu": [0.1504807472229004], "_test_infinite_disk": [0.1434488296508789], "_test_many_disk_writes": [0.725210428237915]}

Total completion time is now about 2.5 hours (roughly 145 to 153 minutes across the two runs), down from the ~4 hours it took before the recent issues and the 6-12 hours it took before this change!
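For anyone who wants to sanity-check the numbers, here is a minimal sketch that sums the per-test timings from Run 1 and lists the slowest tests. It assumes the per-test values are in seconds and that the tests run one after another; the helper names `total_minutes` and `slowest` are made up for illustration and are not part of the stress-test script.

```python
import json

def total_minutes(timings):
    """Sum every recorded duration (assumed to be seconds) and convert to minutes."""
    return sum(sum(durations) for durations in timings.values()) / 60.0

def slowest(timings, n=3):
    """Return the names of the n tests with the largest summed duration, slowest first."""
    totals = {name: sum(durations) for name, durations in timings.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]

# Run 1 timings, copied verbatim from the output above.
run1 = json.loads(
    '{"_test_large_bundle_upload": [1654.1023645401], '
    '"_test_large_bundle_result": [1369.4457926750183], '
    '"_test_many_gpu_runs": [1160.1446483135223], '
    '"_test_multiple_cpus_runs_count": [1184.5576422214508], '
    '"_test_many_bundle_uploads": [355.61165142059326], '
    '"_test_many_worksheet_copies": [728.1842911243439], '
    '"_test_parallel_runs": [2200.3067133426666], '
    '"_test_many_docker_runs": [505.5193784236908], '
    '"_test_infinite_memory": [0.14172840118408203], '
    '"_test_infinite_gpu": [0.13679075241088867], '
    '"_test_infinite_disk": [0.13950204849243164], '
    '"_test_many_disk_writes": [0.9853329658508301]}'
)

print(f"{total_minutes(run1):.1f} min")  # 152.7 min
print(slowest(run1))
```

The summed per-test time comes to about 152.7 minutes, which is within a fraction of a minute of the reported 152.8-minute completion time, so the seconds assumption looks consistent. The three slowest tests in Run 1 are `_test_parallel_runs`, `_test_large_bundle_upload`, and `_test_large_bundle_result`.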

@AndrewJGaut marked this pull request as ready for review March 4, 2023 22:54
@AndrewJGaut merged commit 1c634d4 into master Mar 5, 2023
@AndrewJGaut deleted the adjust-timeouts branch March 5, 2023 01:37
This was referenced Mar 5, 2023