Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Application crash from finalizer in AndroidHttpHandler when disposing #4550

Closed
dotMorten opened this issue Apr 13, 2020 · 34 comments
Closed
Assignees
Labels
Area: App Runtime Issues in `libmonodroid.so`.

Comments

@dotMorten
Copy link
Contributor

I appears to me that the AndroidClientHandler does not appreciate being disposed while a request is in progress / getting cancelled. This can lead to application crashes.

Steps to Reproduce

Run the following code and see the unobserved task exception is hit (even though it's being observed in the continuation method:

        private async void TestRequests()
        {
            TaskScheduler.UnobservedTaskException += (object sender, UnobservedTaskExceptionEventArgs args) =>
            {
                // App will crash here if we don't handle it
                args.SetObserved();
                Debugger.Break();
            };
            for (int i = 0; i < 100; i++)
            {
                //using (var handler = new System.Net.Http.HttpClientHandler()) // Doesn't crash
                using (var handler = new AndroidClientHandler()) 
                {
                    using (HttpClient client = new HttpClient(handler))
                    {
                        using (var request = new HttpRequestMessage(HttpMethod.Get, "http://sampleserver3.arcgisonline.com/ArcGIS/rest/services/Earthquakes/EarthquakesFromLastSevenDays/MapServer/0/query?text=&geometry=&geometryType=esriGeometryPoint&inSR=&spatialRel=esriSpatialRelIntersects&relationParam=&objectIds=&where=1%3D1&time=&returnCountOnly=false&returnIdsOnly=false&returnGeometry=true&maxAllowableOffset=&outSR=&outFields=&f=pjson&t=" + DateTime.Now.Ticks.ToString()))
                        {
                            CancellationTokenSource tcs = new CancellationTokenSource();
                            tcs.CancelAfter(100);
                            try
                            {
                                _ = client.SendAsync(request, tcs.Token).ContinueWith(t =>
                                {
                                });
                            }
                            catch { }
                        }
                    }
                }
                GC.Collect();
                GC.WaitForPendingFinalizers();
            }
        }

This'll throw

A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Cannot access a disposed object.
Object name: 'AndroidClientHandler'.)
StackTrace

  at System.Threading.Tasks.Task.InnerInvoke () [0x0000f] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:2476 
  at System.Threading.Tasks.Task.Execute () [0x00000] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:2319 
--- End of stack trace from previous location where exception was thrown ---

  at Xamarin.Android.Net.AndroidClientHandler.SetupRequestInternal (System.Net.Http.HttpRequestMessage request, Java.Net.URLConnection conn) [0x0020f] in <4ccdb3137d974856b786e1aeebbfbab6>:0 
  at Xamarin.Android.Net.AndroidClientHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x00247] in <4ccdb3137d974856b786e1aeebbfbab6>:0 
  at System.Net.Http.HttpClient.FinishSendAsyncBuffered (System.Threading.Tasks.Task`1[TResult] sendTask, System.Net.Http.HttpRequestMessage request, System.Threading.CancellationTokenSource cts, System.Boolean disposeCts) [0x0017e] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corefx/src/System.Net.Http/src/System/Net/Http/HttpClient.cs:506 

In my specific scenario I'm seeing an exception being thrown in the StatusCode getter:

A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Canceled)
Inner Exception
Java.IO.IOException: Canceled
StackTrace:

    at Java.Interop.JniEnvironment+InstanceMethods.CallIntMethod (Java.Interop.JniObjectReference instance, Java.Interop.JniMethodInfo method, Java.Interop.JniArgumentValue* args) [0x0006e] in <26521a5118b44c858c385715922b9d5d>:0 
    at Java.Interop.JniPeerMembers+JniInstanceMethods.InvokeVirtualInt32Method (System.String encodedMember, Java.Interop.IJavaPeerable self, Java.Interop.JniArgumentValue* parameters) [0x0002a] in <26521a5118b44c858c385715922b9d5d>:0 
    at Java.Net.HttpURLConnection.get_ResponseCode () [0x00000] in /Users/runner/runners/2.165.0/work/1/s/xamarin-android/src/Mono.Android/obj/Release/android-28/mcw/Java.Net.HttpURLConnection.cs:505 
    at Xamarin.Android.Net.AndroidClientHandler+<>c__DisplayClass46_0.<DoProcessRequest>b__2 () [0x00000] in /Users/runner/runners/2.165.0/work/1/s/xamarin-android/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs:426 
    at System.Threading.Tasks.Task`1[TResult].InnerInvoke () [0x0000f] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:534 
    at System.Threading.Tasks.Task.Execute () [0x00000] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:2319 
  --- End of stack trace from previous location where exception was thrown ---
  
    at Xamarin.Android.Net.AndroidClientHandler.DoProcessRequest (System.Net.Http.HttpRequestMessage request, Java.Net.URL javaUrl, Java.Net.HttpURLConnection httpConnection, System.Threading.CancellationToken cancellationToken, Xamarin.Android.Net.AndroidClientHandler+RequestRedirectionState redirectState) [0x00322] in /Users/runner/runners/2.165.0/work/1/s/xamarin-android/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs:426 
    at Xamarin.Android.Net.AndroidClientHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x00285] in /Users/runner/runners/2.165.0/work/1/s/xamarin-android/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs:286 

The get statuscode actually lists this as a property than can throw an IOException.
The line of code is also especially suspicious as it does a Task.Run but doesn't consider things could get disposed as that is being posted:
https://github.com/xamarin/xamarin-android/blob/d48cf04f9749664bf48fc16bcb920d5d941cccab/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs#L426

Expected Behavior

No crash from the finalizer

Actual Behavior

Crash :)

Version Information

Microsoft Visual Studio Enterprise 2019
Version 16.5.3
VisualStudio.16.Release/16.5.3+30002.166
Microsoft .NET Framework
Version 4.8.03752

Installed Version: Enterprise

Visual C++ 2019 00435-60000-00000-AA944
Microsoft Visual C++ 2019

ArcGIS Runtime SDK for .NET 100.7.0
ArcGIS Runtime SDK for .NET allows developers to build immersive, native mapping applications for Windows, Android, and iOS devices using C#. It includes five APIs: WPF to create apps for Windows Desktop, UWP to create Universal Windows apps, Xamarin.Android and Xamarin.iOS for Android and iOS apps that need access to native functionality, and Xamarin.Forms to create apps that share UI layouts across Android, iOS, and UWP.

ASP.NET and Web Tools 2019 16.5.236.49856
ASP.NET and Web Tools 2019

ASP.NET Web Frameworks and Tools 2019 16.5.236.49856
For additional information, visit https://www.asp.net/

Azure App Service Tools v3.0.0 16.5.236.49856
Azure App Service Tools v3.0.0

Azure Functions and Web Jobs Tools 16.5.236.49856
Azure Functions and Web Jobs Tools

C# Tools 3.5.0-beta4-20153-05+20b9af913f1b8ce0a62f72bea9e75e4aa3cf6b0e
C# components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Common Azure Tools 1.10
Provides common services for use by Azure Mobile Services and Microsoft Azure Tools.

Extensibility Message Bus 1.2.0 (d16-2@8b56e20)
Provides common messaging-based MEF services for loosely coupled Visual Studio extension components communication and integration.

IntelliCode Extension 1.0
IntelliCode Visual Studio Extension Detailed Info

Microsoft Azure Tools 2.9
Microsoft Azure Tools for Microsoft Visual Studio 2019 - v2.9.30207.1

Microsoft Continuous Delivery Tools for Visual Studio 0.4
Simplifying the configuration of Azure DevOps pipelines from within the Visual Studio IDE.

Microsoft JVM Debugger 1.0
Provides support for connecting the Visual Studio debugger to JDWP compatible Java Virtual Machines

Microsoft Library Manager 2.1.25+gdacdb9b7a1
Install client-side libraries easily to any web project

Microsoft MI-Based Debugger 1.0
Provides support for connecting Visual Studio to MI compatible debuggers

Microsoft Visual C++ Wizards 1.0
Microsoft Visual C++ Wizards

Microsoft Visual Studio Tools for Containers 1.1
Develop, run, validate your ASP.NET Core applications in the target environment. F5 your application directly into a container with debugging, or CTRL + F5 to edit & refresh your app without having to rebuild the container.

Microsoft Visual Studio VC Package 1.0
Microsoft Visual Studio VC Package

Mono Debugging for Visual Studio 16.5.514 (c4f36a9)
Support for debugging Mono processes with Visual Studio.

NuGet Package Manager 5.5.0
NuGet Package Manager in Visual Studio. For more information about NuGet, visit https://docs.nuget.org/

ProjectServicesPackage Extension 1.0
ProjectServicesPackage Visual Studio Extension Detailed Info

Snapshot Debugging Extension 1.0
Snapshot Debugging Visual Studio Extension Detailed Info

SQL Server Data Tools 16.0.62003.05170
Microsoft SQL Server Data Tools

Test Adapter for Boost.Test 1.0
Enables Visual Studio's testing tools with unit tests written for Boost.Test. The use terms and Third Party Notices are available in the extension installation directory.

Test Adapter for Google Test 1.0
Enables Visual Studio's testing tools with unit tests written for Google Test. The use terms and Third Party Notices are available in the extension installation directory.

TypeScript Tools 16.0.20225.2001
TypeScript Tools for Microsoft Visual Studio

UnoPlatformPackage Extension 1.0
UnoPlatformPackage Visual Studio Extension Detailed Info

Visual Basic Tools 3.5.0-beta4-20153-05+20b9af913f1b8ce0a62f72bea9e75e4aa3cf6b0e
Visual Basic components used in the IDE. Depending on your project type and settings, a different version of the compiler may be used.

Visual F# Tools 10.8.0.0 for F# 4.7 16.5.0-beta.20181.6+85af456066acd4e76d2bc7821b44a325e46f2fca
Microsoft Visual F# Tools 10.8.0.0 for F# 4.7

Visual Studio Code Debug Adapter Host Package 1.0
Interop layer for hosting Visual Studio Code debug adapters in Visual Studio

Visual Studio Container Tools Extensions (Preview) 1.0
View, manage, and diagnose containers within Visual Studio.

Visual Studio Tools for CMake 1.0
Visual Studio Tools for CMake

Visual Studio Tools for Containers 1.0
Visual Studio Tools for Containers

VisualStudio.DeviceLog 1.0
Information about my package

VisualStudio.Foo 1.0
Information about my package

VisualStudio.Mac 1.0
Mac Extension for Visual Studio

Xamarin 16.5.000.533 (d16-5@9152e1b)
Visual Studio extension to enable development for Xamarin.iOS and Xamarin.Android.

Xamarin Designer 16.5.0.470 (remotes/origin/d16-5@681de3fd6)
Visual Studio extension to enable Xamarin Designer tools in Visual Studio.

Xamarin Templates 16.5.49 (0904f41)
Templates for building iOS, Android, and Windows apps with Xamarin and Xamarin.Forms.

Xamarin.Android SDK 10.2.0.100 (d16-5/988c811)
Xamarin.Android Reference Assemblies and MSBuild support.
Mono: c0c5c78
Java.Interop: xamarin/java.interop/d16-5@fc18c54
ProGuard: xamarin/proguard@905836d
SQLite: xamarin/sqlite@46204c4
Xamarin.Android Tools: xamarin/xamarin-android-tools/d16-5@9f4ed4b

Xamarin.iOS and Xamarin.Mac SDK 13.16.0.13 (b75deaf)
Xamarin.iOS and Xamarin.Mac Reference Assemblies and MSBuild support.

@dotMorten dotMorten added the Area: App Runtime Issues in `libmonodroid.so`. label Apr 13, 2020
@jonathanpeppers jonathanpeppers added this to the Under Consideration milestone Apr 13, 2020
@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

Here's a custom handler we came up with that seems to address both Dispose crashes mentioned above, as well as a "Socket closed" exception that happens on some request. This is all really ugly, but seems to do the trick for all our tests. Not code I love to have in my code-base though:

private class FixedAndroidClientHandler : Xamarin.Android.Net.AndroidClientHandler
{
	protected override Task SetupRequest(HttpRequestMessage request, Java.Net.HttpURLConnection conn)
		=> Task.CompletedTask;

	protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
		=> SendAsync(request, cancellationToken, false);

	private async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken, bool isRetry)
	{
		try
		{
			return await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
		}
		catch (Java.IO.IOException ex)
		{
			if (ex.Message.Contains("Canceled"))
			{
				throw new OperationCanceledException("The operation was canceled.", ex);
			}
			else
			{
				if (!isRetry && ex is Java.Net.SocketException && ex.Message.Contains("Socket closed"))
				{
					// During socket-close we'll often succeed if we retry, so retry once
					return await SendAsync(request, cancellationToken, true).ConfigureAwait(false);
				}
				throw new HttpRequestException(ex.Message, ex); // match behavior of other handlers
			}
		}
	}
}

@grendello
Copy link
Contributor

There are two separate issues reported here. I'll address them both in separate comments to keep them clearly apart. First, let's tackle the original crash, the ObjectDisposedException one.

The exception is not an accident. It is thrown by the AssertSelf method which is called from several places in the AndroidClientHandler code but in this case we're dealing specifically with this call. This is an explicit and deliberate check whether the object we're running in is still valid and, as such, the check works properly - the exception should be thrown.

The TLDR reason why it's unobserved is simple - the code in OP does not wait for the async/await (TPL) pipeline to finish while there are several (4) separate tasks involved. If the ContinueWith call is followed by .Wait() or preceded by await, the exception will be correctly delivered within the call context of the task. Please consult the documentation articles here, here and here. Detailed explanation follows.

I instrumented both the sample code as well as AndroidClientHandler code to produce the following log:

04-14 13:37:25.007  6498  6498 I Issue4550: Running request: 56
04-14 13:37:25.007  6498  6498 W monodroid-assembly: typemap: unable to find mapping to a managed type from Java type 'com/android/okhttp/internal/huc/HttpURLConnectionImpl'
04-14 13:37:25.007  6498  6498 I monodroid-timing: Typemap.java_to_managed: end, total time; elapsed: 0s:0::33958
04-14 13:37:25.007  6498  6541 I Issue4550: [56]: in SendAsync/ContinueWith
04-14 13:37:25.008  6498  6498 I ACH     : Before await SetupRequest [56]
04-14 13:37:25.008  6498  6498 I Issue4550: [56] collect
04-14 13:37:25.011  6498  6538 I ACH     : AssertSelf[1 / 56]: I'm disposed!
04-14 13:37:25.015  6498  6498 I pconnectiontes: Explicit concurrent copying GC freed 308(89KB) AllocSpace objects, 0(0B) LOS objects, 55% free, 1247KB/2783KB, paused 17us total 4.301ms
04-14 13:37:25.015  6498  6498 I Issue4550: [56] finalizers
04-14 13:37:25.016  6498  6498 I Issue4550: Running request: 57
04-14 13:37:25.016  6498  6498 W monodroid-assembly: typemap: unable to find mapping to a managed type from Java type 'com/android/okhttp/internal/huc/HttpURLConnectionImpl'
04-14 13:37:25.016  6498  6498 I monodroid-timing: Typemap.java_to_managed: end, total time; elapsed: 0s:0::31979
04-14 13:37:25.016  6498  6534 I Issue4550: [57]: in SendAsync/ContinueWith
04-14 13:37:25.016  6498  6498 I ACH     : Before await SetupRequest [57]
04-14 13:37:25.016  6498  6498 I ACH     : After await SetupRequest [57]
04-14 13:37:25.017  6498  6498 I Issue4550: [57] collect
04-14 13:37:25.024  6498  6498 I pconnectiontes: Explicit concurrent copying GC freed 61(48KB) AllocSpace objects, 0(0B) LOS objects, 55% free, 1247KB/2783KB, paused 18us total 4.090ms
04-14 13:37:25.024  6498  6498 I Issue4550: [57] finalizers
04-14 13:37:25.025  6498  6546 I Issue4550: [57]: in SendAsync/ContinueWith
04-14 13:37:25.027  6498  6521 I Issue4550: Unobserved exception: System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Cannot access a disposed object.
04-14 13:37:25.027  6498  6521 I Issue4550: Object name: 'AndroidClientHandler[1 / 56]'.) ---> System.ObjectDisposedException: Cannot access a disposed object.
04-14 13:37:25.027  6498  6521 I Issue4550: Object name: 'AndroidClientHandler[1 / 56]'.
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.AssertSelf (System.Int32 pos, System.Int32 i) [0x0004c] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler+<>c__DisplayClass55_0.<SetupRequest>b__0 () [0x0001c] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Threading.Tasks.Task.InnerInvoke () [0x0000f] in <5675e4dc9621438ab1423cceabed9e16>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Threading.Tasks.Task.Execute () [0x00000] in <5675e4dc9621438ab1423cceabed9e16>:0 
04-14 13:37:25.027  6498  6521 I Issue4550: --- End of stack trace from previous location where exception was thrown ---
04-14 13:37:25.027  6498  6521 I Issue4550: 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.SetupRequestInternal (System.Net.Http.HttpRequestMessage request, Java.Net.URLConnection conn) [0x00334] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x00340] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Net.Http.HttpClient.FinishSendAsyncBuffered (System.Threading.Tasks.Task`1[TResult] sendTask, System.Net.Http.HttpRequestMessage request, System.Threading.CancellationTokenSource cts, System.Boolean disposeCts) [0x0017e] in <868ed2689364473cbedd047a250730ef
>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:    --- End of inner exception stack trace ---
04-14 13:37:25.027  6498  6521 I Issue4550: ---> (Inner Exception #0) System.ObjectDisposedException: Cannot access a disposed object.
04-14 13:37:25.027  6498  6521 I Issue4550: Object name: 'AndroidClientHandler[1 / 56]'.
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.AssertSelf (System.Int32 pos, System.Int32 i) [0x0004c] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler+<>c__DisplayClass55_0.<SetupRequest>b__0 () [0x0001c] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Threading.Tasks.Task.InnerInvoke () [0x0000f] in <5675e4dc9621438ab1423cceabed9e16>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Threading.Tasks.Task.Execute () [0x00000] in <5675e4dc9621438ab1423cceabed9e16>:0 
04-14 13:37:25.027  6498  6521 I Issue4550: --- End of stack trace from previous location where exception was thrown ---
04-14 13:37:25.027  6498  6521 I Issue4550: 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.SetupRequestInternal (System.Net.Http.HttpRequestMessage request, Java.Net.URLConnection conn) [0x00334] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at Xamarin.Android.Net.AndroidClientHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x00340] in <264be0fcbca545a59d6af42de0747f05>:0 
04-14 13:37:25.027  6498  6521 I Issue4550:   at System.Net.Http.HttpClient.FinishSendAsyncBuffered (System.Threading.Tasks.Task`1[TResult] sendTask, System.Net.Http.HttpRequestMessage request, System.Threading.CancellationTokenSource cts, System.Boolean disposeCts) [0x0017e] in <868ed2689364473cbedd047a250730ef
>:0 <---
04-14 13:37:25.027  6498  6498 I Issue4550: Running request: 58

The instrumented AssertSelf call shows its call site (1) is from the SetupRequest method and the number after the / character is the i value from the for loop inside the sample. Messages from the sample are tagged with Issue4550 while messages from AndroidClientHandler are tagged with ACH. Note that iteration 56 appears to complete properly before 57 starts, as expected, however it throws after 57 is finalized and the exception is thrown outside the outermost using block, around the call to GC.WaitForPendingFinalizers. This happens because neither the SendAsync nor the ContinueWith calls are waited upon. Both of them start a separate task with SendAsync also internally starting yet another task, to call the SetupRequest method. Since neither of the tasks created by the calls in the OP code is waited upon, the loop continues while the task pipeline for the object used in the current iteration continues in the backround. By the time the code gets to the GC calls, the previous iteration object (as we see in the log) may still be running the async/await pipeline, even though the object is already disposed (by exiting the using block) but not yet collected and which only now finishes and throws the exception which is, properly, propagated to the current execution context - outside the outermost using loop and thus the unobserved exception is thrown (also properly). Since we're dealing with tasks and threads, the very nature of them is that some of them will complete, some will not - thus creating a race condition we're seeing. The tasks created by either Task.Run or async/await must be waited upon in one place or another. It can be done by calling .Wait()/await on the task returned by SendAsync or the one from ContinueWith, or another overload of ContinueWith can be used which will be called only for tasks in faulted state. Whatever the approach, the async pipeline inside the loop started with the client.SendAsync call must be allowed to complete before the object is disposed.

@grendello
Copy link
Contributor

The second issue is wrapping Java exceptions in the .NET ones. This was addressed in this commit - please give it a try and see if that works for you. If it doesn't, please file a separate issue, thanks!

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

It is not correct that things must be awaited on. I have a continuation block that runs instead. When things dispose, requests should be cancelled. But the internals has severa; Task.Run calls that doesn't check if things were disposed while posting the operation to a different thread.

As also mentioned in the second callstack (which I don't have a simple repro for), requests has actually been cancelled before disposing.
Both issues are addressed by making SetupRequest non-async and catch the IOException canceled message and turn it into a proper cancelled request.

The second issue with wrapping exceptions is irrelevant here. My workaround handler technically works around 4 issues just to get some resemblance of stability that matches other platforms. However as I build a library and users are free to configure which ever handler they want to use, I can't really use these workarounds in production.

@grendello
Copy link
Contributor

@dotMorten The continuation does not handle the exceptions for you. The exceptions have to be observed. And the requests are disposed - if you look at the code you will see that we cancel the Java native request once it is set up in a cancellation handler. The exception in OP happens before we set up any wrappers or create any Java connections.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

Why are no other handlers exhibiting this behavior then? We literally run 1000s of tests on .net framework, .net core, uwp, iOS and Android. Android has historically been running way worse than the others but we finally traced it down to this bit where cancellations and dispose aren't handled right.

@grendello
Copy link
Contributor

Because other handlers don't call and manage an external native code process which does all the job. They're self-contained, AndroidClientHandler is not

@grendello
Copy link
Contributor

And the important part is - the code which starts the async/await pipeline must observe the exceptions, no matter how many internal async/await etc calls are used. All of those internal calls are part of the entire pipeline as started by, in your case, the client.SendAsync call.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

If what you're saying is true, then this should hit the UnobservedTaskException, but it is not:

Random r = new Random();
for (int i = 0; i < 100; i++)
{
    var _ = Task.Run(async () =>
    {
        await Task.Delay(10);
        if (r.Next(5) == 0)
            throw new Exception();
    });
}

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

Because other handlers don't call and manage an external native code process which does all the job.

That's not true. Most of them rely on an external platform-native handler, like WinInet, NSUrlSession etc. Regardless it's not really a reason why it needs to be more unstable than the others.

Btw I'm trying to set up a proper repro that reproduces the other scenario where in fact I AM awaiting the results, but the multi-threaded behavior is all little tricky to get right to reproduce correctly.

Just to reiterate. The repro above is simplified. I AM awaiting results in my real app.

@dotMorten
Copy link
Contributor Author

Here's some doc on using Task discards. The example even specifically talks about ignoring the thrown exception inside the task:
https://docs.microsoft.com/en-us/dotnet/csharp/discards#a-standalone-discard

@grendello
Copy link
Contributor

grendello commented Apr 14, 2020

If what you're saying is true, then this should hit the UnobservedTaskException, but it is not:

Random r = new Random();
for (int i = 0; i < 100; i++)
{
    var _ = Task.Run(async () =>
    {
        await Task.Delay(10);
        if (r.Next(5) == 0)
            throw new Exception();
    });
}

This code doesn't involve objects that are disposed owning the pipeline that was started - there are no finalizers involved either.

@grendello
Copy link
Contributor

Here's some doc on using Task discards. The example even specifically talks about ignoring the thrown exception inside the task:
https://docs.microsoft.com/en-us/dotnet/csharp/discards#a-standalone-discard

Once again, no objects which are involved in this code are disposed of. The sample from OP runs the TPL pipeline inside an object that is disposed in a racy manner.

@grendello
Copy link
Contributor

It appears that some versions of Mono don't discard the exception properly with the var _ = Task.Run expression, we're still investigating that. This might be the cause of the runaway exception, as the callee code has no way of knowing that the discard expression is used - so it cannot change its behavior. If we confirm that Mono does indeed fail to discard the exception, then the fix belongs in either TPL or the async/await infrastructure - in the Mono BCL. This doesn't change my earlier explanatory comment which still stands and we will open the issue with Mono if necessary.

@dotMorten
Copy link
Contributor Author

Still struggling to get a proper repro outside. There's just too much going on on too many threads, but I have verified I have absolutely no async-voids without proper try/catch and I have no tasks that aren't getting awaited (even the continue-withs have awaits outside them). I have narrowed it down to requests getting canceled though. Also I don't see any discard use in that code-path.

I've also at this point beyond doubt verified that this simple change to the handler will fix the issue. All it does is changing the exception type.

private class NativeHttpMessageHandler : Xamarin.Android.Net.AndroidClientHandler
{
	protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
	{
		try
		{
			return await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
		}
		catch (Java.IO.IOException ex)
		{
			if (ex.Message.Contains("Canceled"))
			{
				throw new OperationCanceledException("The operation was canceled.", ex);
			}
			throw;
		}
		catch (System.Exception ex)
		{
			throw;
		}
	}
}

The mere fact that I change the exception type shouldn't change other callers being observed. I'm starting to think the exception type makes all the difference. If I just catch and re-throw (or throw something else) the problem continues.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 14, 2020

@grendello It does indeed looks like cancellation exceptions are handled specifically:
https://github.com/dotnet/runtime/blob/4f9ae42d861fcb4be2fcd5d3d55d5f227d30e723/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/TaskExceptionHolder.cs#L136

That might explain why other handlers doesn't throw here, because they throw the (correct?) exception

@grendello
Copy link
Contributor

grendello commented Apr 15, 2020

While it's a good find (and Mono uses similar code), it is not something we can use (and I don't think other handlers behave this way either). We do throw the cancellation exception with cancellationToken.ThrowIfCancellationRequested(); when requested, at well-defined points. The cancellation exception can be thrown only once, however, so we can't wrap other exceptions in it like that. My original assessment still stands - caller code must handle all exceptions thrown from the TPL async/await pipeline, by properly waiting on the task(s) to complete. The Java cancellation exception you see is the effect of .NET cancellation, as we register a cancellation handler with the cancellation token used during the request (see here and here). Calling Disconnect on the Java connection causes it to throw the Java IOException with the Canceled message - but this is happening while our cancellation is in progres and the pipeline is about to throw the cancellation exception. Now, we can possibly try to handle that IOException in our code but it would have to be done in two modes, depending on the pipeline state - either rethrow it as-is or ignore it if cancellation is in progress. However, I'm not really sure whether this is what I want to implement. I'll ponder on it a bit more, though.

Regarding the other handlers - I don't think they deal with an API that throws external exceptions like the Android Java API does. Especially on macOS, IIRC, the cancellation behavior is well controlled with protocols, I imagine it's similar with Windows APIs. The Java API wrapped by AndroidClientHandler is not nearly as friendly and has many quirks we have to deal with, thus please don't expect identical behavior the other handlers surface.

Let me reiterate again - there's no escape from handling pipeline exceptions in the caller code. The pipeline (and all the code it runs) is free (as it should be) to throw any exceptions it wants, especially that there's no way we can control any code called from within the pipeline tasks (it can call 3rd party libraries etc).

@dotMorten
Copy link
Contributor Author

Spent days trouble-shooting this one. I ended up pulling all the HTTP code into mine to instrument it, as we just couldn't find anything that appeared to be unhandled. Finally tracked the unhandled exception down. It was incredibly subtle and actually only happens when a cancellation token had been cancelled. So yes you were totally right. I feel like an idiot :-)

However... I did find an issue while instrumenting the android handler, and was sort of the reason why we aren't getting a cancellation exception as we should have (which would have been incorrectly swallowed and we'd been none-the-wiser).

What I found was that first this Task.Run would initiate:
https://github.com/xamarin/xamarin-android/blob/3fee2438fca092f120fc23da407587ba45187345/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs#L426
Then while it's posting, the cancellation code would trigger:
https://github.com/xamarin/xamarin-android/blob/3fee2438fca092f120fc23da407587ba45187345/src/Mono.Android/Xamarin.Android.Net/AndroidClientHandler.cs#L416-L421
After it disconnected, the Action from the first task would execute, and it would try and get the RepsonseCode from the now disposed HttpConnection. This would throw an object disposed exception, when really a cancellation exception should have happened.

Now the funny part is, if I added a check for cancellation right before getting the status, it would always go past that, and in true race-condition mania get disposed right as the response code is getting pulled. I have a feeling the reason for that race condition being so reproducible was because it was all happening during GC.

So the best fix I found was to add a catch right before finally with:

catch
{
   cancellationToken.ThrowIfCancellationRequested()
   throw;
}

So I do think there's a genuine bug here and this issue should be reopened. (it's just not an unhandled exception, but the wrong exception being thrown).
It's probably fairly common that people just move on from tasks and don't bother observing them once they have been cancelled (like we were), and likely why cancellation exceptions aren't ever raised as unhandled.

@grendello
Copy link
Contributor

Spent days trouble-shooting this one. I ended up pulling all the HTTP code into mine to instrument it, as we just couldn't find anything that appeared to be unhandled. Finally tracked the unhandled exception down. It was incredibly subtle and actually only happens when a cancellation token had been cancelled. So yes you were totally right. I feel like an idiot :-)

There's no need to feel this way :) Threading is tricky on its own and full of surprises. It gets worse, IMO, with APIs which aim to hide the complexity behind elaborate machinery. They work 99% of time, but when that 1% fails you're left staring at the screen and wondering what went wrong as it can be any of a thousand things either in the said machinery or in your code. That leads to arduous process of discovery, as you've experienced yourself.

After it disconnected, the Action from the first task would execute, and it would try and get the RepsonseCode from the now disposed HttpConnection. This would throw an object disposed exception, when really a cancellation exception should have happened.

Now the funny part is, if I added a check for cancellation right before getting the status, it would always go past that, and in true race-condition mania get disposed right as the response code is getting pulled. I have a feeling the reason for that race condition being so reproducible was because it was all happening during GC.
Yeah, at this stage things happen so fast that I'm not surprised by this at all.

So the best fix I found was to add a catch right before finally with:

catch
{
   cancellationToken.ThrowIfCancellationRequested()
   throw;
}

So I do think there's a genuine bug here and this issue should be reopened. (it's just not an unhandled exception, but the wrong exception being thrown).

While ugly, the code above is perfectly valid. However, I think what we might do instead in AndroidClientHandler code, is to pass the cancellation token to the status Task.Run call. It should take care of the case when the task is not started yet and the cancellation arrives. I'll open a PR to that effect.

It's probably fairly common that people just move on from tasks and don't bother observing them once they have been cancelled (like we were), and likely why cancellation exceptions aren't ever raised as unhandled.

Yes, it's because async/await (and to a lesser extent TPL itself) project the air of being simple and easy, and it happens that while writing the code we simply forget that the code is not, in fact, our "plain, old code" but a state machine that slices up our neat code into discrete steps. Inside those hidden steps, and in between them, is where lots hidden things happen leading to issues like yours.

grendello added a commit to grendello/xamarin-android that referenced this issue Apr 21, 2020
Context: dotnet#4550

It is possible that, in certain situations, a cancellation request
arriving from code external to `AndroidClientHandler` will not be
satisfied while starting the task to acquire connection status, because
the cancellation request appears after handler already established
connection to the remote host but before the request status is obtained.

Failure to check that the cancellation was requested may result in the
status retrieval task starting only to run into the issue that the
`AndroidClientHandler` object itself is already disposed (since the
cancellation request has been applied) and throw the
`ObjectDisposedException` instead of cancelling the status request
quietly.  Note that this is likely happen only in situations when the
application code aggressively cancels and collects (via calls to the
`GC` class) `AndroidClientHandler` instances or in other situations
which induce potential for timing issues in the TPL or `async/await`
state machine, thus leading to possible race conditions there.

This commit passes cancellation token to the task which runs the status
request, thus preventing it from starting in the situation described
above.  The possibility of a race condition still remains, but is
seriously reduced.
@grendello
Copy link
Contributor

@dotMorten please see if PR #4589 does the trick for you. The reason why I don't want to add the catch solution is that it, effectively, swallows some exception silently and we mustn't do anything like that in library code (it's not a good idea in general, but especially in SDKs or libraries). I'd rather it leak to the application code instead.

@dotMorten
Copy link
Contributor Author

@grendello You change would make sense, but that doesn't work for me. But as I mentioned the disconnect happens right between the task getting started and getting the response code.
I added some instrumentation to it to show the flow:

image
Also added output for unhandled exceptions and when I start/stop a forced GC:

Here's what I'm seeing in the output:

[0:] ***** StatusCode Task started. Getting response code
[0:] ***** Initiating disconnect from cancellation...
[0:] ***** Disconnect complete
[0:] ***** Starting forced full GC
[0:] *************** UNHANDLED EXCEPTION ***************
[0:] System.AggregateException: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. (Canceled)
Inner Exception
  System.Net.Http.HttpRequestException: Canceled
  StackTrace: 
    at Esri.ArcGISRuntime.Http.ArcGISHttpClientHandler+NativeHttpMessageHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken, System.Boolean isRetry) [0x001b2] in E:\apps\dotnet\dotnet-api\src\Esri.ArcGISRuntime\Esri.ArcGISRuntime.Shared\Http\ArcGISHttpClientHandler.cs:1454 
  Inner Exception
    Java.IO.IOException: Canceled
    StackTrace: 
      at Java.Interop.JniEnvironment+InstanceMethods.CallIntMethod (Java.Interop.JniObjectReference instance, Java.Interop.JniMethodInfo method, Java.Interop.JniArgumentValue* args) [0x0006e] in <26521a5118b44c858c385715922b9d5d>:0 
      at Java.Interop.JniPeerMembers+JniInstanceMethods.InvokeVirtualInt32Method (System.String encodedMember, Java.Interop.IJavaPeerable self, Java.Interop.JniArgumentValue* parameters) [0x0002a] in <26521a5118b44c858c385715922b9d5d>:0 
      at Java.Net.HttpURLConnection.get_ResponseCode () [0x00000] in /Users/runner/runners/2.165.0/work/1/s/xamarin-android/src/Mono.Android/obj/Release/android-28/mcw/Java.Net.HttpURLConnection.cs:505 
      at Xamarin.Android.Net.AndroidClientHandler+<>c__DisplayClass46_0.<DoProcessRequest>b__2 () [0x0000c] in E:\apps\dotnet\dotnet-api\src\Esri.ArcGISRuntime\Esri.ArcGISRuntime.Android\Xamarin.Android.Net\AndroidClientHandler.cs:431 
      at System.Threading.Tasks.Task`1[TResult].InnerInvoke () [0x0000f] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:534 
      at System.Threading.Tasks.Task.Execute () [0x00000] in /Users/builder/jenkins/workspace/archive-mono/2019-10/android/release/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs:2319 
    --- End of stack trace from previous location where exception was thrown ---
    
      at Xamarin.Android.Net.AndroidClientHandler.DoProcessRequest (System.Net.Http.HttpRequestMessage request, Java.Net.URL javaUrl, Java.Net.HttpURLConnection httpConnection, System.Threading.CancellationToken cancellationToken, Xamarin.Android.Net.AndroidClientHandler+RequestRedirectionState redirectState) [0x003b1] in E:\apps\dotnet\dotnet-api\src\Esri.ArcGISRuntime\Esri.ArcGISRuntime.Android\Xamarin.Android.Net\AndroidClientHandler.cs:428 
      at Xamarin.Android.Net.AndroidClientHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) [0x00380] in E:\apps\dotnet\dotnet-api\src\Esri.ArcGISRuntime\Esri.ArcGISRuntime.Android\Xamarin.Android.Net\AndroidClientHandler.cs:286 
      at Esri.ArcGISRuntime.Http.ArcGISHttpClientHandler+NativeHttpMessageHandler.SendAsync (System.Net.Http.HttpRequestMessage request, System.Threading.CancellationToken cancellationToken, System.Boolean isRetry) [0x00050] in E:\apps\dotnet\dotnet-api\src\Esri.ArcGISRuntime\Esri.ArcGISRuntime.Shared\Http\ArcGISHttpClientHandler.cs:1439 
[0:] ***************************************************
***** Full GC complete

As you can see it actually enters the start of the task, but then crashes getting the response code.

I'm starting to understand why that seemingly impossible race condition is so reproducible: We request to complete once headers has been received, not when the full response has been received. We do this because some responses are very large, and we don't want to load everything into memory first but instead parse the data as it trickles in. The problem does indeed go away if I were to not do that. So I'm guessing that getting the status code takes a long time, because it doesn't actually complete until the entire stream has been read? (DoOutput is false in my case btw)

@grendello
Copy link
Contributor

@dotMorten I'll address your comment a bit later, but for now just one question - are you using the HEAD request method for your queries? It would return just the headers, without the content GET would return.

@dotMorten
Copy link
Contributor Author

no this is a GET request, with completion-option ResponseHeadersRead

@grendello
Copy link
Contributor

@dotMorten is there a reason you're not using HEAD for this?

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 21, 2020

@grendello HEAD is if you only want the header. I still want the full response, but want to get control back as soon as the headers has been received, so I can start processing the response as it trickles in. Sometimes the responses can be several megabytes (up to several hundreds), and I don't want that read into memory before starting to read through the contents, or dumping it to disk. That way I only ever allocate more or less however large my buffer size is, rather than the entire response.

@grendello
Copy link
Contributor

@dotMorten what's wrong with:

$ telnet httpbin.org 80
Trying 35.170.216.115...
Connected to httpbin.org.
Escape character is '^]'.
HEAD / HTTP/1.1
Host: httpbin.org

HTTP/1.1 200 OK
Date: Tue, 21 Apr 2020 16:52:56 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

GET / HTTP/1.1
Host: httpbin.org

HTTP/1.1 200 OK
Date: Tue, 21 Apr 2020 16:53:07 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 9593
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true

<!DOCTYPE html>
<html lang="en">

<head>

You don't have to disconnect to issue more than one requests with HTTP 1.1+ - just keep the connection alive, and do what I did above.

@dotMorten
Copy link
Contributor Author

Huh? Why would I make two requests? I'm not testing whether the request will go through, I'm avoiding allocating the entire response. I think you're misunderstanding what the HttpCompletionOption is for in the HttpClient.

@stevejgordon wrote a good blogpost recently on it: https://www.stevejgordon.co.uk/using-httpcompletionoption-responseheadersread-to-improve-httpclient-performance-dotnet

@grendello
Copy link
Contributor

HEAD isn't for testing whether request will go through. It's to get the resource size to allocate memory/disk space etc if you plan to request a resource that is large - in other words, it's designed to address precisely your scenario. And HTTP 1.1 allows you to stream more than one request over a single connection - without having to make the connection twice (which is the most expensive part of the HTTP request itself). And it is important in this case because the Java code behind the scenes starts downloading the resource you requested - it doesn't know, nor it doesn't care, about HttpCompletionOption. If, however, you first issue a HEAD request, prepare to get the resource and then issue the GET request, the problem you're seeing will cease to exist.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 21, 2020

Errr say what? No. A HEAD request will only return the headers from the server, and not the contents. I mean to do a GET request and actually get the contents.
There's no resource here to prepare. I just want a stream I can sequentially read through as it comes down, and throw away bytes as I've read them. Again I think you're confusing HttpCompletionOptions with HEAD and GET requests. They have nothing to do with each other.

For sake of (an extreme) argument, let's say you want to make a GET request to a 10gb file: If you don't use the HeadersRead HttpCompletionOption, your app will run out of memory before the GET request even returns. Doing a HEAD request first would never get around that. You'd still have to do the GET request (and wasted a HEAD request) and end up with trying to load it into memory, which would fail.

Another more reasonable scenario is providing download-progress: You'd report progress as the bytes trickle in from the stream. If you don't use the headersread completion option, you wouldn't even get the stream until the entire download has completed and loaded into memory.

@grendello
Copy link
Contributor

I think you keep missing the point. I see you understand very well how HttpCompletionOptions work but they are completely irrelevant here. What's relevant is that:

  1. You abort the request early because because, quoting your words, We request to complete once headers has been received, not when the full response has been received. We do this because some responses are very large
    This suggests to me that you don't really always read all the data, trickle or not, that you sometimes abort the request because you don't really want/need the data based on the HEADERS. Since you cancel the request
    shortly after it started downloading, after you read the headers, then it means you do not need the data and you should have used HEAD to get the headers you need without bothering with HttpCompletionOptions.
    If by We request to complete you mean HttpCompletionOptions then it does nothing at all in this scenario and, furthermore, I don't understand why you cancel the request before everything is read? If you did not cancel, then
    the whole resource would be read by the Java backend and then read by you piecemeal from a stream. If you do cancel, that means you do not need the data and you doing more work than is necessary.
  2. Once you issue the GET request it gets to the Java backend which will start the full download of the resource, heedless of whatever HttpCompletionOptions you used as it has no way of knowing them
  3. As soon as you abort the managed workflow, we disconnect the Java request which is in progress so it throws the exception - again, this would be completely avoided if you used HEAD to decide whether or not you
    really need to download this particular resource. If you did that, the early cancellation would not be necessary at all as the decision would have been made based on HEAD results.
  4. You misunderstood the point of my telnet session above. It was to show that you can efficiently do what you need to do - parse the headers first, make the decision if you want to handle that particular resource and how and then
    either issue GET or close the connection. In no place any request is "wasted" here - it's the most efficient approach.

Let me stress that the request is NOT handled in managed code with AndroidClientHandler, but with Java code so, yes, it will only return when either it's aborted or after it completes the whole read.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 21, 2020

You abort the request early because because, quoting your words, We request to complete once headers has been received, not when the full response has been received.

Complete the TASK not the Request itself. I want the task to return once the header has been received. Whether I cancel has nothing to do with the header. I might just not care about the response any longer half way through downloading it (cancel download, leave page etc).

If it doesn't complete the task until after the whole request has been downloaded into memory we have a huge issue here. IMHO that needs to be fixed, as there's no way to do progress reporting or download large files then.

@grendello
Copy link
Contributor

@dotMorten I'm sorry, I don't have anything else to add. What I wrote above in all of my comments still stands. If you feel there's another issue with the code, please open another issue and describe the problem there. Thanks.

@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 21, 2020

Sure. Your comments are fine but based on a misunderstanding what I'm trying to achieve and how the HttpMessageHandlers are supposed to work with the completion option. At no point did I say I need to do a HEAD request but you keep insisting I should use that.

Of course even the android handler can be made to do this even if it's java. You'd just have to create a custom stream object that wraps the connection and cuts that connection once the stream disposes. I'll open an issue describing how the handler is not following the contract if it truly does what you say it does.

grendello added a commit to grendello/xamarin-android that referenced this issue Apr 22, 2020
Context: dotnet#4550

It is possible that, in certain situations, a cancellation request
arriving from code external to `AndroidClientHandler` will not be
satisfied while starting the task to acquire connection status, because
the cancellation request appears after handler already established
connection to the remote host but before the request status is obtained.

Failure to check that the cancellation was requested may result in the
status retrieval task starting only to run into the issue that the
`AndroidClientHandler` object itself is already disposed (since the
cancellation request has been applied) and throw the
`ObjectDisposedException` instead of cancelling the status request
quietly.  Note that this is likely happen only in situations when the
application code aggressively cancels and collects (via calls to the
`GC` class) `AndroidClientHandler` instances or in other situations
which induce potential for timing issues in the TPL or `async/await`
state machine, thus leading to possible race conditions there.

This commit passes cancellation token to the task which runs the status
request, thus preventing it from starting in the situation described
above.  The possibility of a race condition still remains, but is
seriously reduced.
@dotMorten
Copy link
Contributor Author

dotMorten commented Apr 22, 2020

So built a repro case for the scenario that the handler doesn't return until all data has been read so I could log an issue for it. The good news is: It does return control back to the caller once the headers have been read. So I confirmed that what you stated here isn't true:

it will only return when either it's aborted or after it completes the whole read.

Here's a bit of Forms code that shows bytes downloaded as they trickle in from the network:

       public async Task DownloadFile(string url, CancellationToken cancellationToken)
        {
            try
            {
                HttpClient client = new HttpClient();

                var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead, cancellationToken);

                byte[] buffer = new byte[4 * 1024];
                long progress = 0;
                using (var readstream = await response.Content.ReadAsStreamAsync().ConfigureAwait(false))
                {
                    while (true)
                    {
                        var read = await readstream.ReadAsync(buffer, 0, buffer.Length, cancellationToken).ConfigureAwait(false);
                        progress += read;
                        if (read == 0)
                            break;
                        Device.BeginInvokeOnMainThread(() =>
                        {
                            status.Text = $"{progress} bytes downloaded...";
                        });
                    }
                }
                Device.BeginInvokeOnMainThread(() =>
                {
                    status.Text = $"Completed downloading {progress} bytes";
                });

            }
            catch (OperationCanceledException)
            {
                Device.BeginInvokeOnMainThread(() =>
                {
                    status.Text = $"Download cancelled";
                });
            }
        }

I validated that this is in fact using the default AndroidClientHandler:
image
Also the behavior of the app completely changes if I change the HttpCompletionOption from ResponseHeadersRead to ResponseContentRead (no status update until it's fully downloaded.

So my argument still stands: You need to be able to handle cancellation at any point during the download, which might be after this task returned, but the download is still in-progress.

grendello added a commit to grendello/xamarin-android that referenced this issue Apr 27, 2020
Context: dotnet#4550

It is possible that, in certain situations, a cancellation request
arriving from code external to `AndroidClientHandler` will not be
satisfied while starting the task to acquire connection status, because
the cancellation request appears after handler already established
connection to the remote host but before the request status is obtained.

Failure to check that the cancellation was requested may result in the
status retrieval task starting only to run into the issue that the
`AndroidClientHandler` object itself is already disposed (since the
cancellation request has been applied) and throw the
`ObjectDisposedException` instead of cancelling the status request
quietly.  Note that this is likely happen only in situations when the
application code aggressively cancels and collects (via calls to the
`GC` class) `AndroidClientHandler` instances or in other situations
which induce potential for timing issues in the TPL or `async/await`
state machine, thus leading to possible race conditions there.

This commit passes cancellation token to the task which runs the status
request, thus preventing it from starting in the situation described
above.  The possibility of a race condition still remains, but is
seriously reduced.
jonpryor pushed a commit that referenced this issue Apr 30, 2020
…4589)

Context: #4550

It is possible that, in certain situations, a cancellation request
arriving from code external to `AndroidClientHandler` will not be
satisfied while starting the task to acquire connection status,
because the cancellation request appears after handler already
established connection to the remote host but before the request
status is obtained.

Failure to check that the cancellation was requested may result in the
status retrieval task starting only to run into the issue that the
`AndroidClientHandler` object itself is already disposed (since the
cancellation request has been applied) and throw an
`ObjectDisposedException` instead of cancelling the status request
quietly.  Note that this is likely happen only in situations when the
application code aggressively cancels and collects
`AndroidClientHandler` instances (e.g. via calls to `GC.Collect()`)
or in other situations which induce potential for timing issues in
the TPL or `async/await` state machine, thus leading to possible race
conditions there.

Pass the cancellation token to the task which runs the status request,
thus preventing it from starting in the situation described above.
The possibility of a race condition still remains, but is reduced.
jonpryor pushed a commit that referenced this issue May 6, 2020
…4589)

Context: #4550

It is possible that, in certain situations, a cancellation request
arriving from code external to `AndroidClientHandler` will not be
satisfied while starting the task to acquire connection status,
because the cancellation request appears after handler already
established connection to the remote host but before the request
status is obtained.

Failure to check that the cancellation was requested may result in the
status retrieval task starting only to run into the issue that the
`AndroidClientHandler` object itself is already disposed (since the
cancellation request has been applied) and throw an
`ObjectDisposedException` instead of cancelling the status request
quietly.  Note that this is likely happen only in situations when the
application code aggressively cancels and collects
`AndroidClientHandler` instances (e.g. via calls to `GC.Collect()`)
or in other situations which induce potential for timing issues in
the TPL or `async/await` state machine, thus leading to possible race
conditions there.

Pass the cancellation token to the task which runs the status request,
thus preventing it from starting in the situation described above.
The possibility of a race condition still remains, but is reduced.
@ghost ghost locked as resolved and limited conversation to collaborators Jun 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Area: App Runtime Issues in `libmonodroid.so`.
Projects
None yet
Development

No branches or pull requests

3 participants