-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Application crash from finalizer in AndroidHttpHandler when disposing #4550
Comments
Here's a custom handler we came up with that seems to address both Dispose crashes mentioned above, as well as a "Socket closed" exception that happens on some request. This is all really ugly, but seems to do the trick for all our tests. Not code I love to have in my code-base though: private class FixedAndroidClientHandler : Xamarin.Android.Net.AndroidClientHandler
{
protected override Task SetupRequest(HttpRequestMessage request, Java.Net.HttpURLConnection conn)
=> Task.CompletedTask;
protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
=> SendAsync(request, cancellationToken, false);
private async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken, bool isRetry)
{
try
{
return await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
}
catch (Java.IO.IOException ex)
{
if (ex.Message.Contains("Canceled"))
{
throw new OperationCanceledException("The operation was canceled.", ex);
}
else
{
if (!isRetry && ex is Java.Net.SocketException && ex.Message.Contains("Socket closed"))
{
// During socket-close we'll often succeed if we retry, so retry once
return await SendAsync(request, cancellationToken, true).ConfigureAwait(false);
}
throw new HttpRequestException(ex.Message, ex); // match behavior of other handlers
}
}
}
} |
There are two separate issues reported here. I'll address them both in separate comments to keep them clearly apart. First, let's tackle the original crash, the The exception is not an accident. It is thrown by the The TLDR reason why it's unobserved is simple - the code in OP does not wait for the I instrumented both the sample code as well as
The instrumented |
The second issue is wrapping Java exceptions in the .NET ones. This was addressed in this commit - please give it a try and see if that works for you. If it doesn't, please file a separate issue, thanks! |
It is not correct that things must be awaited on. I have a continuation block that runs instead. When things dispose, requests should be cancelled. But the internals has severa; Task.Run calls that doesn't check if things were disposed while posting the operation to a different thread. As also mentioned in the second callstack (which I don't have a simple repro for), requests has actually been cancelled before disposing. The second issue with wrapping exceptions is irrelevant here. My workaround handler technically works around 4 issues just to get some resemblance of stability that matches other platforms. However as I build a library and users are free to configure which ever handler they want to use, I can't really use these workarounds in production. |
@dotMorten The continuation does not handle the exceptions for you. The exceptions have to be observed. And the requests are disposed - if you look at the code you will see that we cancel the Java native request once it is set up in a cancellation handler. The exception in OP happens before we set up any wrappers or create any Java connections. |
Why are no other handlers exhibiting this behavior then? We literally run 1000s of tests on .net framework, .net core, uwp, iOS and Android. Android has historically been running way worse than the others but we finally traced it down to this bit where cancellations and dispose aren't handled right. |
Because other handlers don't call and manage an external native code process which does all the job. They're self-contained, |
And the important part is - the code which starts the |
If what you're saying is true, then this should hit the UnobservedTaskException, but it is not: Random r = new Random();
for (int i = 0; i < 100; i++)
{
var _ = Task.Run(async () =>
{
await Task.Delay(10);
if (r.Next(5) == 0)
throw new Exception();
});
} |
That's not true. Most of them rely on an external platform-native handler, like WinInet, NSUrlSession etc. Regardless it's not really a reason why it needs to be more unstable than the others. Btw I'm trying to set up a proper repro that reproduces the other scenario where in fact I AM awaiting the results, but the multi-threaded behavior is all little tricky to get right to reproduce correctly. Just to reiterate. The repro above is simplified. I AM awaiting results in my real app. |
Here's some doc on using Task discards. The example even specifically talks about ignoring the thrown exception inside the task: |
This code doesn't involve objects that are disposed owning the pipeline that was started - there are no finalizers involved either. |
Once again, no objects which are involved in this code are disposed of. The sample from OP runs the TPL pipeline inside an object that is disposed in a racy manner. |
It appears that some versions of Mono don't discard the exception properly with the |
Still struggling to get a proper repro outside. There's just too much going on on too many threads, but I have verified I have absolutely no async-voids without proper try/catch and I have no tasks that aren't getting awaited (even the continue-withs have awaits outside them). I have narrowed it down to requests getting canceled though. Also I don't see any discard use in that code-path. I've also at this point beyond doubt verified that this simple change to the handler will fix the issue. All it does is changing the exception type. private class NativeHttpMessageHandler : Xamarin.Android.Net.AndroidClientHandler
{
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
try
{
return await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
}
catch (Java.IO.IOException ex)
{
if (ex.Message.Contains("Canceled"))
{
throw new OperationCanceledException("The operation was canceled.", ex);
}
throw;
}
catch (System.Exception ex)
{
throw;
}
}
} The mere fact that I change the exception type shouldn't change other callers being observed. I'm starting to think the exception type makes all the difference. If I just catch and re-throw (or throw something else) the problem continues. |
@grendello It does indeed looks like cancellation exceptions are handled specifically: That might explain why other handlers doesn't throw here, because they throw the (correct?) exception |
While it's a good find (and Mono uses similar code), it is not something we can use (and I don't think other handlers behave this way either). We do throw the cancellation exception with Regarding the other handlers - I don't think they deal with an API that throws external exceptions like the Android Java API does. Especially on macOS, IIRC, the cancellation behavior is well controlled with protocols, I imagine it's similar with Windows APIs. The Java API wrapped by Let me reiterate again - there's no escape from handling pipeline exceptions in the caller code. The pipeline (and all the code it runs) is free (as it should be) to throw any exceptions it wants, especially that there's no way we can control any code called from within the pipeline tasks (it can call 3rd party libraries etc). |
Spent days trouble-shooting this one. I ended up pulling all the HTTP code into mine to instrument it, as we just couldn't find anything that appeared to be unhandled. Finally tracked the unhandled exception down. It was incredibly subtle and actually only happens when a cancellation token had been cancelled. So yes you were totally right. I feel like an idiot :-) However... I did find an issue while instrumenting the android handler, and was sort of the reason why we aren't getting a cancellation exception as we should have (which would have been incorrectly swallowed and we'd been none-the-wiser). What I found was that first this Task.Run would initiate: Now the funny part is, if I added a check for cancellation right before getting the status, it would always go past that, and in true race-condition mania get disposed right as the response code is getting pulled. I have a feeling the reason for that race condition being so reproducible was because it was all happening during GC. So the best fix I found was to add a catch right before finally with: catch
{
cancellationToken.ThrowIfCancellationRequested()
throw;
} So I do think there's a genuine bug here and this issue should be reopened. (it's just not an unhandled exception, but the wrong exception being thrown). |
There's no need to feel this way :) Threading is tricky on its own and full of surprises. It gets worse, IMO, with APIs which aim to hide the complexity behind elaborate machinery. They work 99% of time, but when that 1% fails you're left staring at the screen and wondering what went wrong as it can be any of a thousand things either in the said machinery or in your code. That leads to arduous process of discovery, as you've experienced yourself.
While ugly, the code above is perfectly valid. However, I think what we might do instead in
Yes, it's because |
Context: dotnet#4550 It is possible that, in certain situations, a cancellation request arriving from code external to `AndroidClientHandler` will not be satisfied while starting the task to acquire connection status, because the cancellation request appears after handler already established connection to the remote host but before the request status is obtained. Failure to check that the cancellation was requested may result in the status retrieval task starting only to run into the issue that the `AndroidClientHandler` object itself is already disposed (since the cancellation request has been applied) and throw the `ObjectDisposedException` instead of cancelling the status request quietly. Note that this is likely happen only in situations when the application code aggressively cancels and collects (via calls to the `GC` class) `AndroidClientHandler` instances or in other situations which induce potential for timing issues in the TPL or `async/await` state machine, thus leading to possible race conditions there. This commit passes cancellation token to the task which runs the status request, thus preventing it from starting in the situation described above. The possibility of a race condition still remains, but is seriously reduced.
@dotMorten please see if PR #4589 does the trick for you. The reason why I don't want to add the |
@grendello You change would make sense, but that doesn't work for me. But as I mentioned the disconnect happens right between the task getting started and getting the response code.
Here's what I'm seeing in the output:
As you can see it actually enters the start of the task, but then crashes getting the response code. I'm starting to understand why that seemingly impossible race condition is so reproducible: We request to complete once headers has been received, not when the full response has been received. We do this because some responses are very large, and we don't want to load everything into memory first but instead parse the data as it trickles in. The problem does indeed go away if I were to not do that. So I'm guessing that getting the status code takes a long time, because it doesn't actually complete until the entire stream has been read? ( |
@dotMorten I'll address your comment a bit later, but for now just one question - are you using the |
no this is a GET request, with completion-option |
@dotMorten is there a reason you're not using |
@grendello HEAD is if you only want the header. I still want the full response, but want to get control back as soon as the headers has been received, so I can start processing the response as it trickles in. Sometimes the responses can be several megabytes (up to several hundreds), and I don't want that read into memory before starting to read through the contents, or dumping it to disk. That way I only ever allocate more or less however large my buffer size is, rather than the entire response. |
@dotMorten what's wrong with:
You don't have to disconnect to issue more than one requests with HTTP 1.1+ - just keep the connection alive, and do what I did above. |
Huh? Why would I make two requests? I'm not testing whether the request will go through, I'm avoiding allocating the entire response. I think you're misunderstanding what the HttpCompletionOption is for in the HttpClient. @stevejgordon wrote a good blogpost recently on it: https://www.stevejgordon.co.uk/using-httpcompletionoption-responseheadersread-to-improve-httpclient-performance-dotnet |
|
Errr say what? No. A HEAD request will only return the headers from the server, and not the contents. I mean to do a GET request and actually get the contents. For sake of (an extreme) argument, let's say you want to make a GET request to a 10gb file: If you don't use the HeadersRead HttpCompletionOption, your app will run out of memory before the GET request even returns. Doing a HEAD request first would never get around that. You'd still have to do the GET request (and wasted a HEAD request) and end up with trying to load it into memory, which would fail. Another more reasonable scenario is providing download-progress: You'd report progress as the bytes trickle in from the stream. If you don't use the headersread completion option, you wouldn't even get the stream until the entire download has completed and loaded into memory. |
I think you keep missing the point. I see you understand very well how
Let me stress that the request is NOT handled in managed code with |
Complete the TASK not the Request itself. I want the task to return once the header has been received. Whether I cancel has nothing to do with the header. I might just not care about the response any longer half way through downloading it (cancel download, leave page etc). If it doesn't complete the task until after the whole request has been downloaded into memory we have a huge issue here. IMHO that needs to be fixed, as there's no way to do progress reporting or download large files then. |
@dotMorten I'm sorry, I don't have anything else to add. What I wrote above in all of my comments still stands. If you feel there's another issue with the code, please open another issue and describe the problem there. Thanks. |
Sure. Your comments are fine but based on a misunderstanding what I'm trying to achieve and how the HttpMessageHandlers are supposed to work with the completion option. At no point did I say I need to do a HEAD request but you keep insisting I should use that. Of course even the android handler can be made to do this even if it's java. You'd just have to create a custom stream object that wraps the connection and cuts that connection once the stream disposes. I'll open an issue describing how the handler is not following the contract if it truly does what you say it does. |
Context: dotnet#4550 It is possible that, in certain situations, a cancellation request arriving from code external to `AndroidClientHandler` will not be satisfied while starting the task to acquire connection status, because the cancellation request appears after handler already established connection to the remote host but before the request status is obtained. Failure to check that the cancellation was requested may result in the status retrieval task starting only to run into the issue that the `AndroidClientHandler` object itself is already disposed (since the cancellation request has been applied) and throw the `ObjectDisposedException` instead of cancelling the status request quietly. Note that this is likely happen only in situations when the application code aggressively cancels and collects (via calls to the `GC` class) `AndroidClientHandler` instances or in other situations which induce potential for timing issues in the TPL or `async/await` state machine, thus leading to possible race conditions there. This commit passes cancellation token to the task which runs the status request, thus preventing it from starting in the situation described above. The possibility of a race condition still remains, but is seriously reduced.
Context: dotnet#4550 It is possible that, in certain situations, a cancellation request arriving from code external to `AndroidClientHandler` will not be satisfied while starting the task to acquire connection status, because the cancellation request appears after handler already established connection to the remote host but before the request status is obtained. Failure to check that the cancellation was requested may result in the status retrieval task starting only to run into the issue that the `AndroidClientHandler` object itself is already disposed (since the cancellation request has been applied) and throw the `ObjectDisposedException` instead of cancelling the status request quietly. Note that this is likely happen only in situations when the application code aggressively cancels and collects (via calls to the `GC` class) `AndroidClientHandler` instances or in other situations which induce potential for timing issues in the TPL or `async/await` state machine, thus leading to possible race conditions there. This commit passes cancellation token to the task which runs the status request, thus preventing it from starting in the situation described above. The possibility of a race condition still remains, but is seriously reduced.
…4589) Context: #4550 It is possible that, in certain situations, a cancellation request arriving from code external to `AndroidClientHandler` will not be satisfied while starting the task to acquire connection status, because the cancellation request appears after handler already established connection to the remote host but before the request status is obtained. Failure to check that the cancellation was requested may result in the status retrieval task starting only to run into the issue that the `AndroidClientHandler` object itself is already disposed (since the cancellation request has been applied) and throw an `ObjectDisposedException` instead of cancelling the status request quietly. Note that this is likely happen only in situations when the application code aggressively cancels and collects `AndroidClientHandler` instances (e.g. via calls to `GC.Collect()`) or in other situations which induce potential for timing issues in the TPL or `async/await` state machine, thus leading to possible race conditions there. Pass the cancellation token to the task which runs the status request, thus preventing it from starting in the situation described above. The possibility of a race condition still remains, but is reduced.
…4589) Context: #4550 It is possible that, in certain situations, a cancellation request arriving from code external to `AndroidClientHandler` will not be satisfied while starting the task to acquire connection status, because the cancellation request appears after handler already established connection to the remote host but before the request status is obtained. Failure to check that the cancellation was requested may result in the status retrieval task starting only to run into the issue that the `AndroidClientHandler` object itself is already disposed (since the cancellation request has been applied) and throw an `ObjectDisposedException` instead of cancelling the status request quietly. Note that this is likely happen only in situations when the application code aggressively cancels and collects `AndroidClientHandler` instances (e.g. via calls to `GC.Collect()`) or in other situations which induce potential for timing issues in the TPL or `async/await` state machine, thus leading to possible race conditions there. Pass the cancellation token to the task which runs the status request, thus preventing it from starting in the situation described above. The possibility of a race condition still remains, but is reduced.
I appears to me that the
AndroidClientHandler
does not appreciate being disposed while a request is in progress / getting cancelled. This can lead to application crashes.Steps to Reproduce
Run the following code and see the unobserved task exception is hit (even though it's being observed in the continuation method:
This'll throw
In my specific scenario I'm seeing an exception being thrown in the StatusCode getter:
The get statuscode actually lists this as a property than can throw an IOException.
The line of code is also especially suspicious as it does a Task.Run but doesn't consider things could get disposed as that is being posted:
https://github.com/xamarin/xamarin-android/blob/d48cf04f9749664bf48fc16bcb920d5d941cccab/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs#L426
Expected Behavior
No crash from the finalizer
Actual Behavior
Crash :)
Version Information
Microsoft Visual Studio Enterprise 2019
Version 16.5.3
VisualStudio.16.Release/16.5.3+30002.166
Microsoft .NET Framework
Version 4.8.03752
Installed Version: Enterprise
Visual C++ 2019 00435-60000-00000-AA944
Microsoft Visual C++ 2019
ArcGIS Runtime SDK for .NET 100.7.0
ArcGIS Runtime SDK for .NET allows developers to build immersive, native mapping applications for Windows, Android, and iOS devices using C#. It includes five APIs: WPF to create apps for Windows Desktop, UWP to create Universal Windows apps, Xamarin.Android and Xamarin.iOS for Android and iOS apps that need access to native functionality, and Xamarin.Forms to create apps that share UI layouts across Android, iOS, and UWP.
ASP.NET and Web Tools 2019 16.5.236.49856
ASP.NET and Web Tools 2019
ASP.NET Web Frameworks and Tools 2019 16.5.236.49856
For additional information, visit https://www.asp.net/
Azure App Service Tools v3.0.0 16.5.236.49856
Azure App Service Tools v3.0.0
Azure Functions and Web Jobs Tools 16.5.236.49856
Azure Functions and Web Jobs Tools
C# Tools 3.5.0-beta4-20153-05+20b9af913f1b8ce0a62f72bea9e75e4aa3cf6b0e
C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.
Common Azure Tools 1.10
Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.
Extensibility Message Bus 1.2.0 (d16-2@8b56e20)
Provides common messaging-based MEF services for loosely coupled Visual Studio extension components communication and integration.
IntelliCode Extension 1.0
IntelliCode Visual Studio Extension Detailed Info
Microsoft Azure Tools 2.9
Microsoft Azure Tools for Microsoft Visual Studio 2019 - v2.9.30207.1
Microsoft Continuous Delivery Tools for Visual Studio 0.4
Simplifying the configuration of Azure DevOps pipelines from within the Visual Studio IDE.
Microsoft JVM Debugger 1.0
Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines
Microsoft Library Manager 2.1.25+gdacdb9b7a1
Install client-side libraries easily to any web project
Microsoft MI-Based Debugger 1.0
Provides support for connecting Visual Studio to MI compatible debuggers
Microsoft Visual C++ Wizards 1.0
Microsoft Visual C++ Wizards
Microsoft Visual Studio Tools for Containers 1.1
Develop, run, validate your ASP.NET Core applications in the target environment. F5 your application directly into a container with debugging, or CTRL + F5 to edit & refresh your app without having to rebuild the container.
Microsoft Visual Studio VC Package 1.0
Microsoft Visual Studio VC Package
Mono Debugging for Visual Studio 16.5.514 (c4f36a9)
Support for debugging Mono processes with Visual Studio.
NuGet Package Manager 5.5.0
NuGet Package Manager in Visual Studio. For more information about NuGet, visit https://docs.nuget.org/
ProjectServicesPackage Extension 1.0
ProjectServicesPackage Visual Studio Extension Detailed Info
Snapshot Debugging Extension 1.0
Snapshot Debugging Visual Studio Extension Detailed Info
SQL Server Data Tools 16.0.62003.05170
Microsoft SQL Server Data Tools
Test Adapter for Boost.Test 1.0
Enables Visual Studio's testing tools with unit tests written for Boost.Test. The use terms and Third Party Notices are available in the extension installation directory.
Test Adapter for Google Test 1.0
Enables Visual Studio's testing tools with unit tests written for Google Test. The use terms and Third Party Notices are available in the extension installation directory.
TypeScript Tools 16.0.20225.2001
TypeScript Tools for Microsoft Visual Studio
UnoPlatformPackage Extension 1.0
UnoPlatformPackage Visual Studio Extension Detailed Info
Visual Basic Tools 3.5.0-beta4-20153-05+20b9af913f1b8ce0a62f72bea9e75e4aa3cf6b0e
Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.
Visual F# Tools 10.8.0.0 for F# 4.7 16.5.0-beta.20181.6+85af456066acd4e76d2bc7821b44a325e46f2fca
Microsoft Visual F# Tools 10.8.0.0 for F# 4.7
Visual Studio Code Debug Adapter Host Package 1.0
Interop layer for hosting Visual Studio Code debug adapters in Visual Studio
Visual Studio Container Tools Extensions (Preview) 1.0
View, manage, and diagnose containers within Visual Studio.
Visual Studio Tools for CMake 1.0
Visual Studio Tools for CMake
Visual Studio Tools for Containers 1.0
Visual Studio Tools for Containers
VisualStudio.DeviceLog 1.0
Information about my package
VisualStudio.Foo 1.0
Information about my package
VisualStudio.Mac 1.0
Mac Extension for Visual Studio
Xamarin 16.5.000.533 (d16-5@9152e1b)
Visual Studio extension to enable development for Xamarin.iOS and Xamarin.Android.
Xamarin Designer 16.5.0.470 (remotes/origin/d16-5@681de3fd6)
Visual Studio extension to enable Xamarin Designer tools in Visual Studio.
Xamarin Templates 16.5.49 (0904f41)
Templates for building iOS, Android, and Windows apps with Xamarin and Xamarin.Forms.
Xamarin.Android SDK 10.2.0.100 (d16-5/988c811)
Xamarin.Android Reference Assemblies and MSBuild support.
Mono: c0c5c78
Java.Interop: xamarin/java.interop/d16-5@fc18c54
ProGuard: xamarin/proguard@905836d
SQLite: xamarin/sqlite@46204c4
Xamarin.Android Tools: xamarin/xamarin-android-tools/d16-5@9f4ed4b
Xamarin.iOS and Xamarin.Mac SDK 13.16.0.13 (b75deaf)
Xamarin.iOS and Xamarin.Mac Reference Assemblies and MSBuild support.
The text was updated successfully, but these errors were encountered: