Increase size of CI runner to fix system tests #483

krschacht · 2024-08-04T13:52:05Z

A couple of days ago we started getting random failures like this:

Capybara starting Puma...
* Version 6.4.0 , codename: The Eagle of Durango
* Min threads: 0, max threads: 4
* Listening on http://127.0.0.1:37807
[Screenshot Image]: /home/runner/work/hostedgpt/hostedgpt/tmp/screenshots/failures_test_refreshing_the_page_after_closing_sidebar_keeps_it_closed.png
E

Error:
NavColumnTest#test_refreshing_the_page_after_closing_sidebar_keeps_it_closed:
Net::ReadTimeout: Net::ReadTimeout with "Net::ReadTimeout with #<TCPSocket:(closed)>"
    test/application_system_test_case.rb:20:in `login_as'
    test/system/messages/nav_column_test.rb:6:in `block in <class:NavColumnTest>'

bin/rails test test/system/messages/nav_column_test.rb:86

I tried catching the exception and retrying, but the exception occurs in different parts of the code, so that's not easy and probably not the right solution. I tried precompiling assets, since some posts suggested this would fix it, but it didn't help. I spent a while figuring out how to increase the Capybara timeout, since various things online suggested that part of the initialization was taking too long. I finally figured it out with this trail of clues:

driven_by directive is defined: https://github.com/rails/rails/blob/9ba208c16835f4a174ae9fd385ebc18972d758a4/actionpack/lib/action_dispatch/system_test_case.rb#L158
the options parameter is passed on through to the driver: https://github.com/rails/rails/blob/main/actionpack/lib/action_dispatch/system_testing/driver.rb#L56
the options[:timeout] value is read by Capybara: https://github.com/teamcapybara/capybara/blob/0480f90168a40780d1398c75031a255c1819dce8/lib/capybara/selenium/driver.rb#L67
the PersistentClient inherits from HttpDefault: https://github.com/teamcapybara/capybara/blob/master/lib/capybara/selenium/patches/persistent_client.rb#L5
the read_timeout: is set from the initializer: https://github.com/SeleniumHQ/selenium/blob/trunk/rb/lib/selenium/webdriver/remote/http/default.rb#L36
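
To make the last link in that trail concrete, here is a minimal sketch of what a `read_timeout` does at the `Net::HTTP` level, which is where Selenium's default HTTP client ends up applying it. This is not code from this PR: the `read_timeout_demo` helper and the silent local server are hypothetical stand-ins for a browser driver that has stopped answering, and the 0.2 second value is chosen only to keep the demo fast.

```ruby
require "net/http"
require "socket"

# Hypothetical helper (not part of this PR): connects to a local server that
# accepts the connection but never replies, and reports what Net::HTTP does.
def read_timeout_demo(timeout_seconds)
  server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
  port = server.addr[1]

  # Stand-in for an unresponsive driver: accept, then stay silent.
  acceptor = Thread.new do
    client = server.accept
    sleep
  ensure
    client&.close
  end

  http = Net::HTTP.new("127.0.0.1", port, nil) # nil disables any env proxy
  http.read_timeout = timeout_seconds          # seconds to wait for a response
  http.max_retries = 0                         # do not silently retry the GET

  begin
    http.get("/")
    :responded
  rescue Net::ReadTimeout
    :read_timeout
  ensure
    acceptor.kill
    acceptor.join
    server.close
  end
end

puts read_timeout_demo(0.2)
```

Raising the timeout only changes how long the client waits. If the other side never answers, the same `Net::ReadTimeout` still surfaces, just later. Following the trail above, the equivalent knob in a Rails system test would presumably be reached by passing something like `options: { timeout: ... }` to `driven_by`, though that exact form is an assumption here and not quoted from the PR's diff.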

But that didn't fix it either. Finally, I increased the size of the GitHub Actions runner just for system tests, and that seems to have fixed it. I tried going back to the previous runner with parallelization turned down to just 1 thread, and that still failed.

I'm not sure why this would start happening all of a sudden. Maybe the overall resources the app is using have increased? Maybe GitHub decreased the default (i.e. free) Actions runner size?

I take that back: the tests ran successfully twice in a row but then started timing out again. In the end, we're using the larger runners (4 cores), but I set parallelization to 2 and increased the timeout just in case.

krschacht added 4 commits August 4, 2024 08:51

Precompile assets before running system tests

3a15be9

fix command typo

cb09b7b

Add retry

ca74594

fix indent

1a4896a

krschacht mentioned this pull request Aug 4, 2024

Add "Forgot Password" reset feature using Postmark #470

Merged

krschacht added 14 commits August 4, 2024 09:19

remove retry & upgrade gem

1bc8f0b

attempt to increase timeout

1bcb548

set timeout a diff way

735d8fd

again

02c8d62

bigger timeout?

41e70f4

Merge branch 'main' into system-test-precompile

265d75f

large runner

e0f20cd

remove precompile

9479f2d

output the hack to confirm it's wroking

58df64a

remove timeout increase

84e01ca

worked on big? try less parallel on small

38cb148

increase timeout

d2921d3

large runner again

ed020db

default timeout

d0f1671

krschacht changed the title ~~Precompile assets before running system tests~~ Increase size of CI runner to fix system tests Aug 6, 2024

krschacht added 6 commits August 5, 2024 20:39

only run on my repo

a965e7a

increase timeout

a92c9cb

revert again

9ff90ea

another config: longer timeout + 2 workers

f94b4e6

add echo notice

220fea2

better notice

8fd51ba

krschacht merged commit 4db8eb6 into main Aug 6, 2024
5 of 6 checks passed

krschacht deleted the system-test-precompile branch August 6, 2024 02:19

krschacht added a commit that referenced this pull request Aug 11, 2024

Revert "Increase size of CI runner to fix system tests (#483)"

b04902d

robacarp mentioned this pull request Aug 23, 2024

adds ssh debugger for selenium test #498

Open
