Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup of runtime test build scripts #58762

Merged
merged 11 commits into from
Oct 6, 2021
Merged

Conversation

trylek
Copy link
Member

@trylek trylek commented Sep 7, 2021

The primary objective of this change is to move most test build logic from the pair of scripts src/tests/build.cmd / sh to the common MSBuild project src/tests/build.proj to reduce maintenance cost incurred by the two scripts that need to be developed in sync, slightly increase performance (especially in small incremental scenarios) by reducing the number of separate MSBuild launches and move closer to the ability to build / run runtime tests from the root build script.

As of publishing this PR I have the change working locally in CoreCLR on Windows and Linux, from now on I need to rely much more on lab testing exercising various build combinations. I believe the change is now ready for at least general conceptual feedback regarding organization of the change. Once the testing gets clean I'll remove the WIP tag and I'll ask for a thorough review of the finalized change.

Thanks

Tomas

/cc @dotnet/runtime-infrastructure

@ghost
Copy link

ghost commented Sep 7, 2021

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

The primary objective of this change is to move most test build logic from the pair of scripts src/tests/build.cmd / sh to the common MSBuild project src/tests/build.proj to reduce maintenance cost incurred by the two scripts that need to be developed in sync, slightly increase performance (especially in small incremental scenarios) by reducing the number of separate MSBuild launches and move closer to the ability to build / run runtime tests from the root build script.

As of publishing this PR I have the change working locally in CoreCLR on Windows and Linux, from now on I need to rely much more on lab testing exercising various build combinations. I believe the change is now ready for at least general conceptual feedback regarding organization of the change. Once the testing gets clean I'll remove the WIP tag and I'll ask for a thorough review of the finalized change.

Thanks

Tomas

/cc @dotnet/runtime-infrastructure

Author: trylek
Assignees: -
Labels:

area-Infrastructure-coreclr

Milestone: -

@jkoritzinsky
Copy link
Member

I haven’t had a chance to really take a close look, but I have one request from what I’ve looked at. Can we please move the native build back into the cmd/sh scripts? Debugging the native build gets more difficult when the main entry-point is through MSBuild, and we aren’t even sharing the Windows and Linux halves of the build, so it doesn’t seem worth moving into MSBuild at the moment.

@trylek
Copy link
Member Author

trylek commented Sep 7, 2021

Thanks @jkoritzinsky for your initial feedback, I think that makes perfect sense at this point, I'll modify the change as you suggest.

@ViktorHofer
Copy link
Member

Can we please move the native build back into the cmd/sh scripts? Debugging the native build gets more difficult when the main entry-point is through MSBuild

IIRC @jkotas mentioned in the past that exposing only the msbuild entry point for the runtime tests should be fine.

@jkoritzinsky
Copy link
Member

If we were to go the route of moving the CMake steps into the MSBuild projects in the future, we should use the Arcade CMake SDK that I designed for that purpose.

@fanyang-mono
Copy link
Member

fanyang-mono commented Sep 8, 2021

The mono mobile targets related change looks fine to me so far. @trylek If you need help investigating the failures for mono mobile targets CI lanes, feel free to let me know. Otherwise, I will review in details when you remove the WIP tag.

@trylek
Copy link
Member Author

trylek commented Sep 16, 2021

Thanks @fanyang-mono, I believe I have reached the stage where I have all CoreCLR tests working and some Mono tests too, there are just two remaining failure buckets I haven't yet managed to get my head around and I would definitely appreciate your help:

  1. For the two arm64 test legs there's some problem with building the native components; the gen-buildsys.sh script claims the targeting architecture is wrong even though it does seem to get passed in the command-line argument.

  2. For the two arm legs there's a configuration inconsistency - we're pulling down a debug version of the framework (just like in pre-existing PR runs without my change) but we seem to be expecting release during the LLVM AOT Compile CoreCLR tests step. Have you got any idea where we're supposed to be overriding the LibraryConfiguration property in this case? In the scripts it defaults to release and I haven't yet found out where we've been switching it over to Debug previously.

Thanks a lot

Tomas

@trylek
Copy link
Member Author

trylek commented Sep 17, 2021

As a side note, I just found out that the test wrapper build is still hosed on Linux, on Windows it's fine now. I'm about to call it a day for now, I'll investigate this tomorrow.

@trylek
Copy link
Member Author

trylek commented Sep 20, 2021

@fanyang-mono, @naricc - I believe I have fixed the most obvious infra issues regarding Mono runs but apparently some failures still remain - could you please take a quick look when you have a chance and help me understand whether some of these are known issues and / or whether there are remaining problems in my change like not passing all the necessary properties to issues.targets so that it doesn't filter out some tests known not to work?

Thanks

Tomas

@naricc
Copy link
Contributor

naricc commented Sep 22, 2021

@trylek So, it looks like what is going wrong is that RuntimeVariant is not being propagated to issues.targets. There are a bunch of tests that are only excluded in certain mono variants, like interpreter. And it seems like they are incorrectly still being run with those variants.

I don't see exactly what in your change is causing that problem though. I will look more tomorrow if you don't figure it out before then.

@trylek trylek force-pushed the TestBuildCleanup branch 2 times, most recently from 51b5fa8 to e539d51 Compare September 23, 2021 13:58
@trylek trylek changed the title WIP: Cleanup of runtime test build scripts Cleanup of runtime test build scripts Sep 28, 2021
@trylek
Copy link
Member Author

trylek commented Sep 28, 2021

With Jeremy's help I have finally managed to nail down the last problem with my change - on Linux, I was generating the XUnit test wrappers twice, first in the correct wrapper generation step and also in the next CORE_ROOT population step which was missing the RuntimeVariant property. I have removed the WIP specifier as I believe the change is now ready to be reviewed and merged in.

Thanks

Tomas

@trylek
Copy link
Member Author

trylek commented Sep 29, 2021

Apologies about the additional churn, triggering the outerloop and R2R pipelines has opened a Pandora's box I haven't yet been able to fully address; I believe that the test priority distinction is now handled correctly but still a problem remains, some tests cannot find their build artifacts, I'm still investigating what is causing this.

Improve passing of parameters to MSBuild; fix native build on Windows

Add detection of __RepoRootDir to eng/native/build-commons.sh

Fix additional _build-commons.sh source needed for handle_arguments

Attempt at fixing the secondary bash source for native build

Native build passing on both Windows and Unix

Combine NativeBuild and BatchRestorePackages into a new TestBuild target

Add managed build to the overarching TestBuild target

Unify handling of __Priority between Windows and Unix

Fix naming variance of the SkipPackageRestore flag

More fixes for package restoration

Fix build target on Unix

Test pipeline full moved to msbuild on Windows

Simplify the target CreateTestOverlay

Simplify CreateTestOverlay target

Fix naming of the CORE_ROOT variable

Fix missing parameters in build.sh (ProjectFilesDir, BinDir, TestBinDir)

Fix build of 1st managed test group

Fix passing CORE_ROOT in publishdependency.targets

Improve condition for the BuildTestWrappers target

Add support for MsgPrefix & ErrMsgPrefix; centralize build step conditions

Revert part of the change pertaining to native test component build per Jeremy's PR feedback; fix processor count on Unix

Fix logic around SkipTestWrappers in build.sh (put it in sync with build.cmd)

Clean up logic around package restoration

Exporting [Err]MsgPrefix as environment vars due to issues with escaping; may need to do the same for all the properties

Fix skipping native build in BuildTestWrappersOnly mode

Skip native build in GenerateLayoutOnly mode

Fix CopyNativeTestBinaries step

Fix support for __Exclude (issues.targets); simplify property passing

Remove the irregularity between CopyNativeProjectBinaries vs. __CopyNativeProjectsAfterCombinedTestBuild

Fix passing of RuntimeFlavor

Fix copynativeonly mode

Put build step conditions back to the individual targets

Typo

Fix dependency graph w.r.t. RestorePackages

Don't suppress package restore as part of the native artifact copying step

Fix native test component build broken due to subtle property changes

Fix logic around copynativeonly

Run restore when copying over native components

Clean up copying native test components

Fix bug in construction of CORE_ROOT in RunTests

Fix incorrect merge

Fix issues.targets on Unix; add instrumentation to test wrapper build

More instrumentation

More instrumentation

Typo

One more typo

Fix copying of native test references

Skip managed test build in 'buildtestwrappersonly' mode

Fix support for __GenerateLayoutOnly

Remove temporary instrumentation; fix handling of RunWithAndroid flag

Don't run the 'standard' native & managed test build in MonoAot mode

Fix MonoBinDir

Remove incorrect setting of _CMDDIR; put back instrumentation to understand issues with test wrappers on Linux

Fix manipulation of XunitTestBinBase on Linux

Put copynativeonly mode in sync between Windows and Unix

Remove instrumentation after confirming that test wrappers now build fine on Unix

Fix issues.targets exclusions on Unix per Nathan's PR feedback

Add binlog production to msbuild invocation on Unix

Clean up some obsoleted variables in build.sh

Put back incorrectly removed __CommonMSBuildArgs; remove __BuildProperties

Fix typo in issues.targets

Fix missing path separator in issues.targets

Add trailing directory separator to XunitTestBinBase; normalize paths

Typo

Instrumentation to figure out why the runtime variant info gets dropped in some legs

Fix logging

More cleanup and instrumentation to nail down the RuntimeVariant bug

Add support for step-specific log names in test build

Typo

Fix erroneous use of the new buildLogRootName arg in the base build.sh

Fix layout generation to stop automatically generating test wrappers

Typo

Keep binlog for local instrumentations only, not as the default

Fix framework crossgenning on Linux and perfmap support

One more fix for framework crossgenning

Fix test priority propagation

Migrate common properties to Directory.Build.props

Force out-of-proc msbuild execution for building managed test groups

Fix library configuration in child msbuild execution
@trylek
Copy link
Member Author

trylek commented Oct 1, 2021

OK, I believe I'm out of the woods now, PR is clean and so is most of outerloop and R2R run, please review when you have a chance!

eng/common/msbuild.sh Outdated Show resolved Hide resolved
src/tests/Directory.Build.props Outdated Show resolved Hide resolved
src/tests/build.cmd Show resolved Hide resolved
src/tests/build.proj Outdated Show resolved Hide resolved
src/tests/build.sh Outdated Show resolved Hide resolved
Comment on lines +86 to +113
# Export properties as environment variables for the MSBuild scripts to use
export __TestDir
export __TestIntermediatesDir
export __NativeTestIntermediatesDir
export __BinDir
export __TestBinDir
export __SkipManaged
export __SkipGenerateLayout
export __SkipTestWrappers
export __BuildTestProject
export __BuildTestDir
export __BuildTestTree
export __RuntimeFlavor
export __CopyNativeProjectsAfterCombinedTestBuild
export __CopyNativeTestBinaries
export __Priority
export __DoCrossgen2
export __CreatePerfMap
export __CompositeBuildMode
export __BuildTestWrappersOnly
export __GenerateLayoutOnly
export __TestBuildMode
export __MonoAot
export __MonoFullAot
export __MonoBinDir
export __MsgPrefix
export __ErrMsgPrefix
export __Exclude
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally not a fan of how MSBuild implicitly imports the current environment as properties and I generally try to avoid it if possible. Additionally, if we ever are able to move to unifying the runtime test build script entry-point into the root build scripts, this will definitely need to be cleaned up for that. This can wait until later work though as the environment variables are easily identifiable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if you don't mind, I'd prefer to leave that to a separate follow-up change. The main problem is that escaping will be necessary at least for some of the properties (in particular I hit this with the __MsgPrefix and __ErrMsgPrefix variables) and that will require a bit more experimentation; moreover the escaping will be probably done in slightly different ways in the cmd/sh version of the script.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can save this for later work.

@@ -1,6 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="12.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<Import Project="Directory.Build.props" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since it looks like run.proj doesn't import the NoTargets SDK, it won't import the Directory.Build.props/targets files. If that's intentional, it might be worthwhile to move into a subdirectory and use the NoTargets SDK + empty Directory.Build.props/targets files to make it more easily understood at a glance. (I know I've sunk a few hours here and there trying to figure out why Directory.Build.props/targets wasn't being imported when I think it should have been or vice versa).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been actually thinking about merging the run.proj script into the build.proj script; with this change it's no longer called directly, build.proj is always the "front-end"; after all, even with my additions build.proj has less than 200 lines and in combination with run.proj it will have just about 1000 lines. But I would also prefer to do that separately to make git history cleaner, merging the scripts will be a purely mechanical change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure we can hold off on that until later.

src/tests/run.proj Outdated Show resolved Hide resolved
This change addresses the simpler aspects of Jeremy's feedback
(organization, naming, documentation). I'll address the CallTarget
comment in a follow-up commit. For the environment variables,
I would indeed prefer to address that separately in a follow-up
change as Jeremy himself suggested in his PR feedback.

Thanks

Tomas
@@ -121,7 +121,7 @@ build_Tests()
buildArgs+=("/p:NUMBER_OF_PROCESSORS=${__NumProc}")
buildArgs+=("${__UnprocessedBuildArgs[@]}")

# Disable warnAsError - coreclr issue 19922
# Disable warnAsError - https://github.com/dotnet/runtime/issues/11077
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this issue has been closed, it's worth taking a look and seeing if we can remove the warnAsError flag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in the latest commit, let's see what the lab says.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted per your PR comment, I'm rerunning the outerloop job.

Copy link
Member

@jkoritzinsky jkoritzinsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM barring CI failures. Looks like there might be some cases that still need -warnAsError false in the Pri1 test build, so we can undo that change if needed to get CI more green.

@trylek
Copy link
Member Author

trylek commented Oct 5, 2021

Aah, good point, thanks Jeremy, I've just been tearing my (remaining) hair out over the weird five failures in the Pri1 build. I'll try to put back the warnAsError clause for now and I'll add a comment to the issue detailing the five or so failing tests.

@trylek trylek merged commit be91858 into dotnet:main Oct 6, 2021
@trylek trylek deleted the TestBuildCleanup branch October 6, 2021 19:04
@ghost ghost locked as resolved and limited conversation to collaborators Nov 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants