Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[android] .NET 9 p6, trimmer step no longer saving changes to disk #103987

Closed
jonathanpeppers opened this issue Jun 25, 2024 · 14 comments · Fixed by #104267
Closed

[android] .NET 9 p6, trimmer step no longer saving changes to disk #103987

jonathanpeppers opened this issue Jun 25, 2024 · 14 comments · Fixed by #104267
Assignees
Labels
area-Tools-ILLink .NET linker development as well as trimming analyzers
Milestone

Comments

@jonathanpeppers
Copy link
Member

Description

We noticed this happening on both of these PRs:

We have a test failure:

Xamarin.Android.JcwGen_Tests, Xamarin.Android.JcwGenTests.BindingTests.JavaAbstractMethodTest / Release
Error message
System.TypeLoadException : VTable setup of type Library.MyClrCursor failed
Stack trace
   at System.Reflection.MethodBaseInvoker.InterpretedInvoke_Method(Object obj, IntPtr* args)
   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object , BindingFlags )

This test is verifying that this trimmer step works:

What the trimmer step does, is attempt to "fill in" new abstract members of new Android APIs. This would allow existing net8.0-android NuGet packages to work in .NET 9, when Android adds new abstract members. (unfortunately Google does!). The general idea, is we'd prevent an exception unless the new member was actually called.

It seems like our trimmer step is doing its work:

ILLink: Added method: System.Void Test.Bindings.ICursor::NewMethod() to type: Library.MyClrCursor scope: Xamarin.Android.FixJavaAbstractMethod-Library.dll
ILLink: Added method: System.Int32 Test.Bindings.ICursor::NewMethodWithParams(System.Int32,System.String) to type: Library.MyClrCursor scope: Xamarin.Android.FixJavaAbstractMethod-Library.dll
ILLink: Added method: System.Int32 Test.Bindings.ICursor::NewMethodWithParams(System.Int32,System.String,System.Single) to type: Library.MyClrCursor scope: Xamarin.Android.FixJavaAbstractMethod-Library.dll
ILLink: Added method: System.Int32 Test.Bindings.ICursor::NewMethodWithRT() to type: Library.MyClrCursor scope: Xamarin.Android.FixJavaAbstractMethod-Library.dll

But we don't see the changes in illink's output.

Reproduction Steps

  • Build .NET for Android
  • Run test project
.\dotnet-local.cmd build tests\CodeGen-Binding\Xamarin.Android.JcwGen-Tests\Xamarin.Android.JcwGen-Tests.csproj -c Release -t:Install,RunTestApks --p:_ExtraTrimmerArgs=--verbose

Expected behavior

JavaAbstractMethodTest passes

Actual behavior

JavaAbstractMethodTest fails with TypeLoadException

Regression?

Yes, this worked in .NET 9 Preview 5

Known Workarounds

None

Configuration

Runtime: 9.0.0-preview.6.24321.8 or newer

Other information

.binlog: illink-verbose.zip

Output assemblies: linked.zip

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jun 25, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jun 25, 2024
@jonathanpeppers jonathanpeppers added the area-Tools-ILLink .NET linker development as well as trimming analyzers label Jun 25, 2024
Copy link
Contributor

Tagging subscribers to this area: @agocke, @sbomer, @vitek-karas
See info in area-owners.md if you want to be subscribed.

jonathanpeppers added a commit to dotnet/android that referenced this issue Jun 25, 2024
jonathanpeppers added a commit to dotnet/android that referenced this issue Jun 25, 2024
jonathanpeppers added a commit to dotnet/android that referenced this issue Jun 25, 2024
Changes: dotnet/sdk@1741345...ea9243f
Changes: dotnet/runtime@84b3339...117cfcc
Changes: dotnet/emsdk@53288f8...9880d89
Changes: dotnet/cecil@7a4a59f...d145726

Updates:

* VS.Tools.Net.Core.SDK.Resolver: from 9.0.100-preview.5.24262.2 to 9.0.100-preview.7.24323.5
* Microsoft.NETCore.App.Ref: from 9.0.0-preview.5.24256.1 to 9.0.0-preview.6.24319.11
* Microsoft.NET.Workload.Emscripten.Current.Manifest-9.0.100.Transport: from 9.0.0-preview.5.24223.2 to 9.0.0-preview.6.24317.2
* Microsoft.NET.ILLink.Tasks: from 9.0.0-preview.5.24256.1 to 9.0.0-preview.6.24319.11
* Microsoft.DotNet.Cecil: from 0.11.4-alpha.24230.1 to 0.11.4-alpha.24313.1

~~ Other changes ~~

* `$(NuGetAudit)` triggers warnings in some tests:

Context: https://github.com/NuGet/docs.microsoft.com-nuget/blob/eed234f4b3edb7358e06cd2370828412a7dbd3f6/docs/concepts/Auditing-Packages.md#configuring-nuget-audit

    warning NU1902: Package 'Microsoft.AspNetCore.Components' 6.0.0 has a known moderate severity vulnerability, GHSA-3fx3-85r4-8j3w

Bump NuGet Packages Microsoft.AspNetCore.Components.WebView to 8.0.* to fix this.

* Update `*.apkdesc` files, as app size has changed slightly

* `Xamarin.Android.JcwGen_Tests` fails with:

    Xamarin.Android.JcwGen_Tests, Xamarin.Android.JcwGenTests.BindingTests.JavaAbstractMethodTest / Release
    Error message
    System.TypeLoadException : VTable setup of type Library.MyClrCursor failed
    Stack trace
       at System.Reflection.MethodBaseInvoker.InterpretedInvoke_Method(Object obj, IntPtr* args)
       at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object , BindingFlags )

Temporarily, ignored the test and filed:

* dotnet/runtime#103987

Co-authored-by: Jonathan Peppers <jonathan.peppers@microsoft.com>
Co-authored-by: Dean Ellis <dellis1972@googlemail.com>
jonathanpeppers added a commit to dotnet/android that referenced this issue Jun 25, 2024
Changes: dotnet/sdk@1741345...3c7b1bd
Changes: dotnet/runtime@84b3339...89a244e
Changes: dotnet/emsdk@53288f8...9880d89
Changes: dotnet/cecil@7a4a59f...d145726

Updates:

* VS.Tools.Net.Core.SDK.Resolver: from 9.0.100-preview.5.24262.2 to 9.0.100-preview.6.24324.9
* Microsoft.NETCore.App.Ref: from 9.0.0-preview.5.24256.1 to 9.0.0-preview.6.24321.8
* Microsoft.NET.Workload.Emscripten.Current.Manifest-9.0.100.Transport: from 9.0.0-preview.5.24223.2 to 9.0.0-preview.6.24319.1
* Microsoft.NET.ILLink.Tasks: from 9.0.0-preview.5.24256.1 to 9.0.0-preview.6.24321.8
* Microsoft.DotNet.Cecil: from 0.11.4-alpha.24230.1 to 0.11.4-alpha.24313.1

~~ Other changes ~~

* `$(NuGetAudit)` triggers warnings in some tests:

Context: https://github.com/NuGet/docs.microsoft.com-nuget/blob/eed234f4b3edb7358e06cd2370828412a7dbd3f6/docs/concepts/Auditing-Packages.md#configuring-nuget-audit

    warning NU1902: Package 'Microsoft.AspNetCore.Components' 6.0.0 has a known moderate severity vulnerability, GHSA-3fx3-85r4-8j3w

Bump NuGet Packages Microsoft.AspNetCore.Components.WebView to 8.0.* to fix this.

* Update `*.apkdesc` files, as app size has changed slightly

* `Xamarin.Android.JcwGen_Tests` fails with:

    Xamarin.Android.JcwGen_Tests, Xamarin.Android.JcwGenTests.BindingTests.JavaAbstractMethodTest / Release
    Error message
    System.TypeLoadException : VTable setup of type Library.MyClrCursor failed
    Stack trace
       at System.Reflection.MethodBaseInvoker.InterpretedInvoke_Method(Object obj, IntPtr* args)
       at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object , BindingFlags )

Temporarily, ignored the test and filed:

* dotnet/runtime#103987

Co-authored-by: Jonathan Peppers <jonathan.peppers@microsoft.com>
Co-authored-by: Dean Ellis <dellis1972@googlemail.com>
@agocke
Copy link
Member

agocke commented Jun 28, 2024

@sbomer

@agocke agocke added this to the 9.0.0 milestone Jun 28, 2024
@agocke agocke removed the untriaged New issue has not been triaged by the area owner label Jun 28, 2024
@sbomer sbomer self-assigned this Jun 28, 2024
@vitek-karas
Copy link
Member

On debug build this asserts in the trimmer, but it looks unrelated to the problem:

Debug.Assert (preserve != TypePreserve.Nothing);

The call comes from:
https://github.com/dotnet/android/blob/06bb1dc6a292ef5618a3bb6ecca3ca869253ff2e/src/Xamarin.Android.Build.Tasks/Linker/MonoDroid.Tuner/FixAbstractMethodsStep.cs#L346

This assert has been there for 3 years now - added in dotnet/linker#1768
And the caller was updated to match the new APIs in dotnet/android#5748 - apparently not correctly :-)

So this is probably a long standing bug which doesn't seem to matter much - I didn't try to figure out what the assert guards against and thus what is the problem (been a while).

@sbomer - probably worth filing a bug/PR in dotnet/android for this.

@vitek-karas
Copy link
Member

vitek-karas commented Jun 28, 2024

(This looks to be a bit of a bug farm :-))
For context - The above asserts is happening around type Java.Lang.AbstractMethodError
Ignoring the above assert leads to another assert, this time in:

Debug.Assert (GetPreservedMethods (definition) == null);

The method in question is Java.Lang.ICharSequenceInvoker..ctor - so it's very likely unrelated to the previous assert.
The caller is (a bit indirectly):
https://github.com/dotnet/android/blob/06bb1dc6a292ef5618a3bb6ecca3ca869253ff2e/src/Microsoft.Android.Sdk.ILLink/MarkJavaObjects.cs#L189

and then

https://github.com/dotnet/android/blob/06bb1dc6a292ef5618a3bb6ecca3ca869253ff2e/src/Microsoft.Android.Sdk.ILLink/MarkJavaObjects.cs#L147

I vaguely remember something around the AddPreserveMethod and the complexity behind it. This feels like an internal illink problem which is triggered by the custom step. So either unexpected call sequence from the custom step or some other problem.

Another very likely unrelated bug to the original issue here.

This second assert is hit many times over on various other methods, in all cases the class name seems to include the word Invoker - but I don't know if that makes it special somehow.

@vitek-karas
Copy link
Member

The actual problem in this case is because the added method is not marked so SweepStep removes it.

I looked at this:

ILLink: Added method: System.Void Test.Bindings.ICursor::NewMethod() to type: Library.MyClrCursor scope: Xamarin.Android.FixJavaAbstractMethod-Library.dll

The part I don't understand is that this behavior has been like this since forever. It's possible this has something to do with interface overrides as the added method is meant to implement ICursor.NewMethod which seems to be preserved in the output (it's part of a different assembly). So it's possible that before the more recent changes to better implement interface marking this method ended up being marked by some other logic in the linker, but now it's not.

This looks really weird - the output is "corrupted" I think:
image

image

As you can see the MyClrCursor class doesn't implement the method NewMethod on the interface ICursor even though it does implement that interface.

It's possible that this is caused by the fact that the method is added by a custom step, since the addition happens during MarkType and it might be "Too late" for some of the internal caches in the trimmer - but really I'm just guessing, unfortunately I lost track of how interface marking works in the trimmer after the recent changes.

I think this will require either @jtschuster or @sbomer to take a deeper look.

I'll send you the repro offline.

@vitek-karas
Copy link
Member

On a related note:
@jonathanpeppers it might be a good idea to try to see if we could run a debug trimmer in one of the CI lanes in the dotnet/android repo - as apparently the custom steps are not following the expected behavior :-)
But I don't know if we publish debug packages on the transfer feeds.

@jtschuster
Copy link
Member

#103140 could be relevant here.

@sbomer
Copy link
Member

sbomer commented Jun 29, 2024

Thank you for the repro and notes @vitek-karas!

fbb67a6 introduced the specific regression. Similar to #103115 (comment), this looks like a bad interaction with custom steps and potentially with the interface logic.

The custom step is relying on getting called before we build the type info for types it touches. This isn't guaranteed, and I am surprised it didn't cause problems earlier. Here's the sequence of events in the test failure:

Before fbb67a6, while processing JavaAbstractMethodTest, the call to Library.MyClrCursor () would call MarkType for MyClrCursor. This called the custom step immediately (instead of delaying until after we process the reflection-accessed method), so it happened that the custom step got a chance to process the type before we called into MapTypeInfo for the assembly. But I think this is a latent issue that just happened to show up now, since we don't guarantee the custom step gets called before building type info.

If we want to make this guarantee we might need to build the type info cache per type, instead of per assembly, and ensure it never gets built for a type before ProcessType calls the custom step.

@vcsjones vcsjones removed the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Jul 1, 2024
@jonathanpeppers
Copy link
Member Author

jonathanpeppers commented Jul 1, 2024

On a related note: @jonathanpeppers it might be a good idea to try to see if we could run a debug trimmer in one of the CI lanes in the dotnet/android repo - as apparently the custom steps are not following the expected behavior :-) But I don't know if we publish debug packages on the transfer feeds.

If someone can share some instructions, I could try this, thanks. We just use dotnet-install.sh/ps1 to provision .NET, so I would expect we'd need some MSBuild logic to swap ILLink packs.

It does seem like it would be useful for part of our CI to fail, if one of those Debug.Assert() calls are hit.

@agocke
Copy link
Member

agocke commented Jul 1, 2024

But I don't know if we publish debug packages on the transfer feeds.

Unfortunately we don't. It might be tricky to enable this. We do sometimes pass around non-shipping bits in transport packages, but building in multiple configurations can get hairy, and you have to be very careful not to depend on the non-shipping bits in any shipping configuration.

It might be easier to light up these asserts based on some environment variable.

@sbomer sbomer closed this as completed in c5f2d4f Jul 2, 2024
jonathanpeppers added a commit to jonathanpeppers/xamarin-android that referenced this issue Jul 11, 2024
Context: dotnet/runtime#103987

dotnet/runtime#103987 was fixed, and we should have the changes in 094bf61.

Let's enable the test again.
jonathanpeppers added a commit to dotnet/android that referenced this issue Jul 15, 2024
Context: dotnet/runtime#103987

dotnet/runtime#103987 was fixed, and we should have the changes in 094bf61.

Let's enable the test again.
@github-actions github-actions bot locked and limited conversation to collaborators Aug 2, 2024
@jtschuster
Copy link
Member

@jonathanpeppers I want to make sure I understand the problem that FixAbstractMethods solves. Would the problem also be fixed if the Android libraries created default interface methods that throw Java.Lang.AbstractMethodError (or if the trimmer adds an implementation into the interface instead of the implementing type)?

@jonathanpeppers
Copy link
Member Author

Yes, there isn't a good public link that describes it. Ancient commit history from Xamarin says:

[linker] fix abstract java methods

  • it adds FixAbstractMethods linker step, which looks for methods
    with unimplemented abstract methods in non-abstract classes,
    derived from Java.Lang.Object class, and fixes them by adding
    implementation, which throws AbstractMethodError exception
  • the newly added step is used in XA linker
  • the LinkAssemblies task is now used even for Debug configuration
    when LinkMode is set to LinkNone. very simple "linker" is build in
    this case, where the pipeline contains only FixAbstractMethods and
    Output steps. this task in Debug configuration takes about 80 to
    190 ms on my system, with test app

radekdoulik
committed 9 years ago

My understanding, it's something like:

  1. Android version X has an ICursor interface
  2. You implement it in C#, put it in a NuGet package
  3. Android version Y adds a new method to ICursor, so we add it to the Android workload for .NET

The problem here is C# will throw TypeLoadException very early if you try to use the existing C# class in no. 2. While for Java (JVM), they are less strict about it -- you would only get AbstractMethodError if you called the method, while C# it happens when the type is loaded.

So, the trimmer step goes through and adds the missing methods, and just makes them throw. But this gives us the same behavior as Java.

@agocke
Copy link
Member

agocke commented Aug 29, 2024

@sbomer and I discussed this a few weeks ago and I think the conclusion was that strictly adding default interface methods that throw wasn’t fully correct because you want the compiler to error if the user doesn’t implement all the methods. The specific case this is trying to protect against is when new interface methods were added later, in newer runtime assemblies.

An alternate fix would be to produce reference assemblies with no default implementations, and impl assemblies with default implementations.

Another fix would be a standalone tool that runs before the linker and does the same thing. There’s no dependency in linker behavior here.

@jonathanpeppers
Copy link
Member Author

I think the conclusion was that strictly adding default interface methods that throw wasn’t fully correct because you want the compiler to error if the user doesn’t implement all the methods.

Yes, they would get this behavior if a NuGet package author updated their major .NET version newer and recompiled. But the problem occurs for existing NuGet packages. You can still use net8.0-android class libraries in net9.0-android apps, for example.

Another fix would be a standalone tool that runs before the linker and does the same thing. There’s no dependency in linker behavior here.

For this part, we would need to do something like this for NativeAOT as well, as I think there is no custom trimmer step support.

What we have right now is the solution from almost a decade ago. This is something we could rework for .NET 10, if there is a good reason for removing this step.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Tools-ILLink .NET linker development as well as trimming analyzers
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants