Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_runner: fix timeout in *Each hook failing further tests #48925

Conversation

rluvaton
Copy link
Member

Fix #48917

@nodejs-github-bot
Copy link
Collaborator

Review requested:

  • @nodejs/test_runner

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Jul 25, 2023
@rluvaton rluvaton requested a review from MoLow July 25, 2023 19:43
Copy link
Member

@MoLow MoLow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rluvaton
Copy link
Member Author

Cc @atlowChemi

@atlowChemi atlowChemi added the request-ci Add this label to start a Jenkins CI on a PR. label Jul 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 27, 2023
@nodejs-github-bot
Copy link
Collaborator

@atlowChemi atlowChemi added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Jul 27, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 27, 2023
@nodejs-github-bot
Copy link
Collaborator

@atlowChemi
Copy link
Member

@rluvaton the new tests seem to be failing https://ci.nodejs.org/job/node-test-commit-linux/nodes=fedora-latest-x64/53007/consoleText

        not ok 7 - test-runner/output/timeout_in_before_each_should_not_affect_further_tests.js
          ---
          duration_ms: 772.821225
          failureType: 'testCodeFailure'
          error: |-
            Expected values to be strictly equal:
            + actual - expected ... Lines skipped
            
              'gonna timeout\n' +
                'TAP version 13\n' +
            ...
                '# Subtest: after each timeout\n' +
                '    # Subtest: second describe first test\n' +
            +   '    ok 1 - second describe first test\n' +
            -   '    not ok 1 - second describe first test\n' +
                '      ---\n' +
                '      duration_ms: *\n' +
            +   '      ...\n' +
            +   '    # Subtest: second describe second test\n' +
            +   '    ok 2 - second describe second test\n' +
            +   '      ---\n' +
            +   '      duration_ms: *\n' +
            -   "      failureType: 'hookFailed'\n" +
            -   "      error: 'failed running afterEach hook'\n" +
            -   "      code: 'ERR_TEST_FAILURE'\n" +
            -   '      stack: |-\n' +
            -   '        async Promise.all (index 0)\n' +
                '      ...\n' +
            +   '    1..2\n' +
            +   'ok 2 - after each timeout\n' +
            +   '  ---\n' +
            +   '  duration_ms: *\n' +
            +   "  type: 'suite'\n" +
            +   '  ...\n' +
            +   '1..2\n' +
            +   '# tests 4\n' +
            +   '# suites 2\n' +
            +   '# pass 3\n' +
            +   '# fail 1\n' +
            -   '    # Subtest: second describe second test\n' +
            -   '    ok 2 - second describe second test\n' +
            -   '      ---\n' +
            -   '      duration_ms: *\n' +
            -   '      ...\n' +
            -   '    1..2\n' +
            -   'not ok 2 - after each timeout\n' +
            -   '  ---\n' +
            -   '  duration_ms: *\n' +
            -   "  type: 'suite'\n" +
            -   "  failureType: 'subtestsFailed'\n" +
            -   "  error: '1 subtest failed'\n" +
            -   "  code: 'ERR_TEST_FAILURE'\n" +
            -   '  ...\n' +
            -   '1..2\n' +
            -   '# tests 4\n' +
            -   '# suites 2\n' +
            -   '# pass 2\n' +
            -   '# fail 2\n' +
                '# cancelled 0\n' +
                '# skipped 0\n' +
                '# todo 0\n' +
                '# duration_ms *\n'
          code: 'ERR_ASSERTION'
          name: 'AssertionError'
          expected: |-
            gonna timeout
            TAP version 13
            not gonna timeout
            before each test second
            after each test first
            gonna timeout
            # Subtest: before each timeout
                # Subtest: first describe first test
                not ok 1 - first describe first test
                  ---
                  duration_ms: *
                  failureType: 'hookFailed'
                  error: 'failed running beforeEach hook'
                  code: 'ERR_TEST_FAILURE'
                  stack: |-
                    async Promise.all (index 0)
                  ...
                # Subtest: first describe second test
                ok 2 - first describe second test
                  ---
                  duration_ms: *
                  ...
                1..2
            not ok 1 - before each timeout
              ---
              duration_ms: *
              type: 'suite'
              failureType: 'subtestsFailed'
              error: '1 subtest failed'
              code: 'ERR_TEST_FAILURE'
              ...
            after each test second
            not gonna timeout
            # Subtest: after each timeout
                # Subtest: second describe first test
                not ok 1 - second describe first test
                  ---
                  duration_ms: *
                  failureType: 'hookFailed'
                  error: 'failed running afterEach hook'
                  code: 'ERR_TEST_FAILURE'
                  stack: |-
                    async Promise.all (index 0)
                  ...
                # Subtest: second describe second test
                ok 2 - second describe second test
                  ---
                  duration_ms: *
                  ...
                1..2
            not ok 2 - after each timeout
              ---
              duration_ms: *
              type: 'suite'
              failureType: 'subtestsFailed'
              error: '1 subtest failed'
              code: 'ERR_TEST_FAILURE'
              ...
            1..2
            # tests 4
            # suites 2
            # pass 2
            # fail 2
            # cancelled 0
            # skipped 0
            # todo 0
            # duration_ms *
            
          actual: |-
            gonna timeout
            TAP version 13
            not gonna timeout
            before each test second
            after each test first
            gonna timeout
            # Subtest: before each timeout
                # Subtest: first describe first test
                not ok 1 - first describe first test
                  ---
                  duration_ms: *
                  failureType: 'hookFailed'
                  error: 'failed running beforeEach hook'
                  code: 'ERR_TEST_FAILURE'
                  stack: |-
                    async Promise.all (index 0)
                  ...
                # Subtest: first describe second test
                ok 2 - first describe second test
                  ---
                  duration_ms: *
                  ...
                1..2
            not ok 1 - before each timeout
              ---
              duration_ms: *
              type: 'suite'
              failureType: 'subtestsFailed'
              error: '1 subtest failed'
              code: 'ERR_TEST_FAILURE'
              ...
            after each test second
            not gonna timeout
            # Subtest: after each timeout
                # Subtest: second describe first test
                ok 1 - second describe first test
                  ---
                  duration_ms: *
                  ...
                # Subtest: second describe second test
                ok 2 - second describe second test
                  ---
                  duration_ms: *
                  ...
                1..2
            ok 2 - after each timeout
              ---
              duration_ms: *
              type: 'suite'
              ...
            1..2
            # tests 4
            # suites 2
            # pass 3
            # fail 1
            # cancelled 0
            # skipped 0
            # todo 0
            # duration_ms *
            
          operator: 'strictEqual'
          stack: |-
            assertSnapshot (/home/iojs/build/workspace/node-test-commit-linux/test/common/assertSnapshot.js:38:12)
            async Module.spawnAndAssert (/home/iojs/build/workspace/node-test-commit-linux/test/common/assertSnapshot.js:66:3)
            async TestContext.<anonymous> (file:///home/iojs/build/workspace/node-test-commit-linux/test/parallel/test-runner-output.mjs:66:5)
            async Test.run (node:internal/test_runner/test:582:9)
            async Promise.all (index 6)
            async Suite.run (node:internal/test_runner/test:849:7)
            async startSubtest (node:internal/test_runner/harness:204:3)
          ...

@rluvaton
Copy link
Member Author

rluvaton commented Jul 27, 2023

@atlowChemi This is really weird as it passed in the GitHub Actions CI and locally...
Can you rerun it and see if it's flaky?

Is there any difference between Jenkins and GitHub Actions

@atlowChemi
Copy link
Member

atlowChemi commented Jul 27, 2023

Can you rerun it and see if it's flaky?

IMHO if we know it is flaky we should attempt to fix it and avoid merging a flaky test 🙂
Also, this doesn't seem flaky per-se, it failed on same exact issue consistently between all the jobs attempting to run it

@atlowChemi
Copy link
Member

atlowChemi commented Jul 27, 2023

Is there any difference between Jenkins and GitHub Actions

I truly don't know enough to answer that. What I can say is searching for test runner output in all the Github actions showed no results

EDIT: OK, took the time to have a look at this 🙂
The Github actions runs the make run-ci command:

This runs build ci & make test-ci:

node/Makefile

Lines 578 to 579 in 4b3d964

run-ci: build-ci
$(MAKE) test-ci -j1
which will execute a specific list of subfolders:

node/Makefile

Lines 545 to 559 in 4b3d964

.PHONY: test-ci
# Related CI jobs: most CI tests, excluding node-test-commit-arm-fanned
test-ci: LOGLEVEL := info
test-ci: | clear-stalled bench-addons-build build-addons build-js-native-api-tests build-node-api-tests doc-only
out/Release/cctest --gtest_output=xml:out/junit/cctest.xml
$(PYTHON) tools/test.py $(PARALLEL_ARGS) -p tap --logfile test.tap \
--mode=$(BUILDTYPE_LOWER) --flaky-tests=$(FLAKY_TESTS) \
$(TEST_CI_ARGS) $(CI_JS_SUITES) $(CI_NATIVE_SUITES) $(CI_DOC)
out/Release/embedtest 'require("./test/embedding/test-embedding.js")'
$(info Clean up any leftover processes, error if found.)
ps awwx | grep Release/node | grep -v grep | cat
@PS_OUT=`ps awwx | grep Release/node | grep -v grep | awk '{print $$1}'`; \
if [ "$${PS_OUT}" ]; then \
echo $${PS_OUT} | xargs kill -9; exit 1; \
fi

node/Makefile

Lines 511 to 515 in 4b3d964

JS_SUITES ?= default
NATIVE_SUITES ?= addons js-native-api node-api
# CI_* variables should be kept synchronized with the ones in vcbuild.bat
CI_NATIVE_SUITES ?= $(NATIVE_SUITES) benchmark
CI_JS_SUITES ?= $(JS_SUITES) pummel
And if you have a look at the output of the workflow you can see indeed it doesn't execute the parallel tests: https://github.com/nodejs/node/actions/runs/5661148767/job/15338439104?pr=48925#step:6:5504
image

@rluvaton
Copy link
Member Author

Thank you @atlowChemi for the detailed explanation!

@MoLow
Copy link
Member

MoLow commented Jul 28, 2023

@atlowChemi JS_SUITES is default - that does include parallel tests.

@rluvaton
Copy link
Member Author

JS_SUITES is default - that does include parallel tests.

so it is running?

@rluvaton
Copy link
Member Author

IMHO if we know it is flaky we should attempt to fix it

of course, just wanna know what I'm dealing with... if it's flaky running 1K times would help fail the test locally, if not then other things should be done...

@atlowChemi
Copy link
Member

@atlowChemi JS_SUITES is default - that does include parallel tests.

That is defined in test.py? Thanks for clarifying 🙂
Any explanation as to why it doesn't fail, or why it doesn't show up in the search? @MoLow

@atlowChemi atlowChemi added the request-ci Add this label to start a Jenkins CI on a PR. label Jul 28, 2023
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jul 28, 2023
@MoLow
Copy link
Member

MoLow commented Jul 28, 2023

@atlowChemi JS_SUITES is default - that does include parallel tests.

That is defined in test.py? Thanks for clarifying 🙂 Any explanation as to why it doesn't fail, or why it doesn't show up in the search? @MoLow

yes, here:

node/tools/test.py

Lines 1562 to 1563 in 48345d0

if len(args) == 0 or 'default' in args:
def_suites = [s for s in suites if s not in IGNORED_SUITES]

it doesn't show up since the reporter/process indicator used (via -p actions) prints just dots, and the test passed in gh actions.
remember flakiness has many factors and can be effected from different hardware

@rluvaton rluvaton deleted the timeout-once-in-before-each-should-keep-running branch August 1, 2023 06:32
pluris pushed a commit to pluris/node that referenced this pull request Aug 6, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
pluris pushed a commit to pluris/node that referenced this pull request Aug 7, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Ceres6 pushed a commit to Ceres6/node that referenced this pull request Aug 14, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
UlisesGascon pushed a commit to UlisesGascon/node that referenced this pull request Aug 14, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
RafaelGSS pushed a commit that referenced this pull request Aug 15, 2023
PR-URL: #48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
@UlisesGascon UlisesGascon mentioned this pull request Aug 15, 2023
RafaelGSS pushed a commit to RafaelGSS/node that referenced this pull request Aug 15, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
@RafaelGSS RafaelGSS added the backport-requested-v20.x PRs awaiting manual backport to the v20.x-staging branch. label Aug 17, 2023
@RafaelGSS
Copy link
Member

Due to fact, #48877 didn't land cleanly on v20.x-staging. This PR somehow, depends on it. So we'll need a manual backport.

Reference: https://github.com/nodejs/node/blob/main/doc/contributing/backporting-to-release-lines.md

rluvaton added a commit to rluvaton/node that referenced this pull request Aug 18, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
@rluvaton rluvaton added backport-open-v20.x Indicate that the PR has an open backport and removed backport-requested-v20.x PRs awaiting manual backport to the v20.x-staging branch. labels Aug 18, 2023
@rluvaton
Copy link
Member Author

rluvaton added a commit to rluvaton/node that referenced this pull request Sep 4, 2023
PR-URL: nodejs#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
UlisesGascon pushed a commit that referenced this pull request Sep 10, 2023
PR-URL: #48925
Backport-PR-URL: #49225
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
@UlisesGascon UlisesGascon mentioned this pull request Sep 10, 2023
@targos targos added backported-to-v20.x PRs backported to the v20.x-staging branch. and removed backport-open-v20.x Indicate that the PR has an open backport labels Nov 12, 2023
targos pushed a commit that referenced this pull request Nov 27, 2023
PR-URL: #48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
PR-URL: nodejs/node#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
sercher added a commit to sercher/graaljs that referenced this pull request Apr 25, 2024
PR-URL: nodejs/node#48925
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Reviewed-By: Chemi Atlow <chemi@atlow.co.il>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. backported-to-v20.x PRs backported to the v20.x-staging branch. commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem.
Projects
None yet
7 participants