
PSA: Full CI time down to ~30 minutes #1761

Closed
refack opened this issue Apr 13, 2019 · 5 comments

@refack
Contributor

refack commented Apr 13, 2019

After the optimizations to the AIX, Windows, and RPi jobs, we are down to about 30 minutes
πŸ’ƒ πŸ‘―β€β™‚οΈ πŸŽ‰
Also reduced the multiplicity of the Windows test from 5 sub-jobs to 3, and the ARM fanned job from 6 to 5.

@joaocgreis
Member

This is quite nice!

@refack I took a look at the Windows job changes and have some questions/comments:

  • the increase in speed is achieved only by using clcache, right? I see the build is still of the release type; it should always be the same as the official released binaries.
  • the new job calls test.py directly, but vcbuild should be changed to have the js and native targets like the Makefile. This has been on my backlog for quite some time; is this on yours as well, or is there a reason not to do it?
  • the "Do not fail the build on empty test results" checkbox is ticked on both test jobs. Won't this allow for false positives? Do you know if there is any situation where this is needed?
  • we have to get clcache into Ansible...
  • having two test jobs is quite a nice way of organizing the job. It creates some repetition, but the differences are so many that it pays off. I might end up extending this.

@refack
Contributor Author

refack commented Apr 18, 2019

My comments below are just RE the Windows job. To get to ~30 minutes we also optimized the AIX job, the ARM native job, and the SmartOS job, and dropped the RPi 1 job to experimental.


  • the increase in speed is achieved only by using clcache, right? I see the build is still of the release type; it should always be the same as the official released binaries.

The increase is due to 3 incremental changes:

  1. More use of precompiled headers (a huge speedup by itself) - about 50% of the compile time
  2. clcache - about 30% more (a sketch of how clcache can be hooked into the build follows this list)
  3. Splitting the addons test so it is run only once - that one by itself is an ~80% speedup of the test job.
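
For reference, here is a minimal sketch of one common way to hook clcache into an MSBuild-driven build such as vcbuild. The cache location, tool path, and solution name are illustrative assumptions, not the actual job configuration:

rem clcache keeps its cache in CLCACHE_DIR (path assumed for illustration).
set "CLCACHE_DIR=C:\workspace\clcache"
rem CLToolExe/CLToolPath make MSBuild invoke clcache instead of cl.exe;
rem TrackFileAccess=false disables the MSBuild file tracker, which does not
rem play well with clcache.
msbuild node.sln /m /p:Configuration=Release ^
  /p:CLToolExe=clcache.exe /p:CLToolPath=C:\workspace\clcache ^
  /p:TrackFileAccess=false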
  • the new job calls test.py directly, but vcbuild should be changed to have the js and native targets like the Makefile. This has been on my backlog for quite some time; is this on yours as well, or is there a reason not to do it?

Yeah, now that the new job is 1:1 with Node 12, I'd also like to have the test logic in vcbuild or something similar that is part of the code tree.
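
As a rough illustration, here is a hypothetical vcbuild-style fragment with separate js and native test targets mirroring the Makefile split; the target names and suite names are assumptions, not existing vcbuild options:

@echo off
rem Hypothetical js/native test targets for a vcbuild-like script.
setlocal
set "test_args="
:next-arg
if "%1"=="" goto run-tests
if /i "%1"=="test-js"     set "test_args=%test_args% default"& shift& goto next-arg
if /i "%1"=="test-native" set "test_args=%test_args% addons"& shift& goto next-arg
echo Unknown option: %1 & exit /b 1
:run-tests
if defined test_args python tools\test.py --mode=release%test_args%
endlocal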

  • the "Do not fail the build on empty test results" checkbox is ticked on both test jobs. Won't this allow for false positives? Do you know if there is any situation where this is needed?

The job script itself checks whether the tests produced any output. Because we don't have the latest tap2junit on all machines, sometimes we just don't have the xUnit output for Jenkins to parse and display.

  • we have to get clcache into Ansible...

Yep, and an updated tap2junit.

  • having two test jobs is quite a nice way of organizing the job. It creates some repetition, but the differences are so many that it pays off. I might end up extending this.

I tried to make one job that would fit both Node 12 and 6-11, but it was too much noise, so for 6-11 the test runs just as it did before, and for Node 12 the new job was slimmed down extensively.

Optimally I'd say the test script should be part of the code tree, so the test jobs would only need to call it, but sometimes you want to play with the script and stabilize it, and only then land it in node-core.

@joaocgreis
Member

The job script itself checks whether the tests produced any output. Because we don't have the latest tap2junit on all machines, sometimes we just don't have the xUnit output for Jenkins to parse and display.

In that case, when a test fails, how does the job turn red? (False positives are the number one thing I worry about when looking at Jenkins jobs; in my opinion it's the worst thing that can happen...)

@refack
Contributor Author

refack commented Apr 18, 2019

In that case, when a test fails, how does the job turn red?

exit /B 1

Or more specifically:

rem EXIT_RETURN_VALUE is assumed to have been set earlier in the job script
rem from test.py's exit code, so a failing test already marks the job red.
rem Also fail the build if the JS test run produced no TAP output at all.
if not exist test.tap (
  echo "Missing test.tap file"
  set "EXIT_RETURN_VALUE=1"
)
rem If this configuration is expected to run cctest, require evidence of its
rem output: either the TAP file or the converted xUnit XML.
set CCEXIT_RETURN_VALUE=0
if "%SHOULD_HAVE_CCTEST%" == "1" (
  set CCEXIT_RETURN_VALUE=1
  if exist cctest.tap (
    echo "Found cctest.tap file"
    set "CCEXIT_RETURN_VALUE=0"
  )
  if exist out\junit\cctest.xml (
    echo "Found out\junit\cctest.xml file"
    set "CCEXIT_RETURN_VALUE=0"
  )
)
if %CCEXIT_RETURN_VALUE% EQU 1 exit /b 1

exit /b %EXIT_RETURN_VALUE%
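
For context, a minimal sketch of how EXIT_RETURN_VALUE is assumed to be populated earlier in the script from test.py's exit code (the test.py arguments here are placeholders):

python tools\test.py -j %NUMBER_OF_PROCESSORS% --mode=release default
set "EXIT_RETURN_VALUE=%ERRORLEVEL%"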

@joaocgreis
Member

I see now, the job looks good πŸ‘

What I was worried about was a test failing with a real not ok on a machine without tap2junit. Before, this was caught by the test output parser, but I see now we use the return code of test.py.

@refack refack pinned this issue Jun 2, 2019
@rvagg rvagg unpinned this issue Jul 26, 2019