Increase size of CI runner to fix system tests #483

Merged (24 commits) on Aug 6, 2024

@krschacht krschacht commented Aug 4, 2024

A couple of days ago we started getting random failures like this:

Capybara starting Puma...
* Version 6.4.0 , codename: The Eagle of Durango
* Min threads: 0, max threads: 4
* Listening on http://127.0.0.1:37807
[Screenshot Image]: /home/runner/work/hostedgpt/hostedgpt/tmp/screenshots/failures_test_refreshing_the_page_after_closing_sidebar_keeps_it_closed.png
E

Error:
NavColumnTest#test_refreshing_the_page_after_closing_sidebar_keeps_it_closed:
Net::ReadTimeout: Net::ReadTimeout with "Net::ReadTimeout with #<TCPSocket:(closed)>"
    test/application_system_test_case.rb:20:in `login_as'
    test/system/messages/nav_column_test.rb:6:in `block in <class:NavColumnTest>'

bin/rails test test/system/messages/nav_column_test.rb:86

I tried catching the exception and retrying, but it is raised from different parts of the code, so that's not easy and probably not the right solution. I tried precompiling assets, since some posts suggested that would fix it, but it didn't help. I spent a while figuring out how to increase the Capybara timeout, since various things online suggested that part of the initialization was taking too long. I finally figured it out with this trail of clues:

But that didn't fix it either. Finally, I increased the size of the GitHub Actions runner just for system tests, and that seems to have fixed it. I tried going back to the previous runner with parallelization turned down to just 1 thread, and that still failed.
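
For anyone debugging the same `Net::ReadTimeout`, here is a rough sketch of one common way to raise the timeout in a Capybara + Selenium setup. It is not necessarily what this PR ended up doing, and it assumes the error comes from the HTTP client that selenium-webdriver uses to talk to the browser driver. The helper `selenium_read_timeout`, the `SELENIUM_READ_TIMEOUT` variable, the `:headless_chrome_slow_ci` driver name, and the 120-second default are all made up for illustration. Note that `Capybara.default_max_wait_time` is a separate setting that controls how long finders wait for elements, so changing it alone would likely not affect this error.

```ruby
# Sketch only, under the assumptions stated above.

# Hypothetical helper: read the timeout (in seconds) from an env var so CI can
# tune it without a code change. The variable name and default are assumptions.
def selenium_read_timeout(env = ENV)
  Integer(env.fetch("SELENIUM_READ_TIMEOUT", "120"))
end

# The registration below only runs when Capybara and selenium-webdriver are
# already loaded (for example from test/application_system_test_case.rb).
# Without them this file just defines the helper above.
if defined?(Capybara) && defined?(Selenium::WebDriver)
  Capybara.register_driver :headless_chrome_slow_ci do |app|
    # HTTP client used for browser <-> driver commands; its read timeout is
    # the one that surfaces as Net::ReadTimeout when the browser is slow.
    client = Selenium::WebDriver::Remote::Http::Default.new
    client.read_timeout = selenium_read_timeout

    options = Selenium::WebDriver::Chrome::Options.new
    options.add_argument("--headless=new")

    Capybara::Selenium::Driver.new(
      app,
      browser: :chrome,
      options: options,
      http_client: client
    )
  end
end
```

A system test case could then select the driver by name, for example with `driven_by :headless_chrome_slow_ci`, assuming the registration has run before the tests start.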

I'm not sure why this would start happening all of a sudden. Maybe the overall resources the app is using have increased? Maybe GitHub decreased the default (i.e. free) Actions runner size?

I take that back: the tests ran successfully twice in a row but then started timing out again. In the end, we're using the larger runners (4 cores), but I set parallelization to 2 and increased the timeout just in case.
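
As a rough illustration of the "parallelization to 2" part, the worker count can be chosen in one place and passed to Rails' `parallelize`. This is a sketch, not the code from this PR: the helper name `system_test_workers` and its `ci_workers` keyword are hypothetical, and it assumes the CI environment sets `CI=true`, as GitHub Actions does.

```ruby
# Sketch only: pick the number of parallel test workers.
# On CI we cap the count (2 here, matching the setting described above) so the
# browser processes don't starve each other; locally we defer to Rails'
# default of one worker per processor.
def system_test_workers(env = ENV, ci_workers: 2)
  env["CI"] == "true" ? ci_workers : :number_of_processors
end

# Intended use, inside test/test_helper.rb of a Rails app:
#
#   class ActiveSupport::TestCase
#     parallelize(workers: system_test_workers)
#   end
```

Keeping the cap below the runner's core count leaves headroom for the Chrome processes and the Puma server that each worker starts alongside the test process itself.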

@krschacht krschacht changed the title from "Precompile assets before running system tests" to "Increase size of CI runner to fix system tests" Aug 6, 2024
@krschacht krschacht merged commit 4db8eb6 into main Aug 6, 2024
5 of 6 checks passed
@krschacht krschacht deleted the system-test-precompile branch August 6, 2024 02:19