Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Our code for recovering after a "foreign await" needs to be reworked #552

Closed
dhirschfeld opened this issue Jul 1, 2018 · 7 comments · Fixed by #1178
Closed

Our code for recovering after a "foreign await" needs to be reworked #552

dhirschfeld opened this issue Jul 1, 2018 · 7 comments · Fixed by #1178

Comments

@dhirschfeld
Copy link
Member

Just doing some testing to see if I can use dask.distributed with trio & trio_asyncio but have fallen at the first hurdle and hit a TrioInternalError:

import asyncio
import trio
import trio_asyncio
from dask.distributed import Client


async def f():
    async with trio_asyncio.open_loop() as loop:
        client = await Client(asynchronous=True)
        future = client.submit(lambda x: x + 1, 10)
        return await future

trio.run(f)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
C:\Miniconda3\lib\site-packages\trio\_core\_run.py in run(async_fn, clock, instruments, restrict_keyboard_interrupt_to_checkpoints, *args)
   1228                     # is just to get rid of this extra indentation.
-> 1229                     result = run_impl(runner, async_fn, args)
   1230             except TrioInternalError:

C:\Miniconda3\lib\site-packages\trio\_core\_run.py in run_impl(runner, async_fn, args)
   1356                 # other exceptions as __context__ in unwanted ways.
-> 1357                 runner.task_exited(task, final_result)
   1358             else:

C:\Miniconda3\lib\site-packages\trio\_core\_run.py in task_exited(self, task, result)
    854             # the init task should be the last task to exit
--> 855             assert not self.tasks
    856         else:

AssertionError: 

The above exception was the direct cause of the following exception:

TrioInternalError                         Traceback (most recent call last)
<ipython-input-41-03c750b89f65> in <module>()
----> 1 trio.run(f)

C:\Miniconda3\lib\site-packages\trio\_core\_run.py in run(async_fn, clock, instruments, restrict_keyboard_interrupt_to_checkpoints, *args)
   1233                 raise TrioInternalError(
   1234                     "internal error in trio - please file a bug!"
-> 1235                 ) from exc
   1236             finally:
   1237                 GLOBAL_RUN_CONTEXT.__dict__.clear()

TrioInternalError: internal error in trio - please file a bug!

Happy to move this to the trio_asyncio repo if that's a more appropriate place...

@dhirschfeld
Copy link
Member Author

dhirschfeld commented Jul 1, 2018

For some context, dask.distributed runs on top of tornado (>=5.0) which itself uses the asyncio event loop so I had hoped I could use it through trio_asyncio.

I'm fine if what I'm trying to do is too esoteric to support, I just thought it would be interesting/cool to be able to use distributed with trio rather than asyncio and had hoped it might Just Work. I was hoping to use trio as I think it will make my own code easier to understand/maintain and therefore reduce the chance I shoot myself in the foot (which I've done with asyncio!)

I'm mostly opening this issue in case it's of interest or an easy/obvious fix.

@smurfix
Copy link
Contributor

smurfix commented Jul 2, 2018

Yes, trio_asyncio is the right place for this bug.

No, you can't do that. You're still in a Trio context; the fact that you just started an asyncio loop doesn't change that.

This code should work (warning, literal translation but untested):

from functools import partial

async def f():
    async with trio_asyncio.open_loop() as loop:
        client = await loop.run_asyncio(partial(Client, asynchronous=True))
        future = client.submit(lambda x: x + 1, 10)
        return await loop.run_future(future)

though you should be able to combine the last two lines to

        return await loop.run_asyncio(client.submit, lambda x: x + 1, 10)

@smurfix
Copy link
Contributor

smurfix commented Jul 2, 2018

… and yes I need to improve error handling for this kind of mistake.

@smurfix
Copy link
Contributor

smurfix commented Jul 2, 2018

@njsmith
Copy link
Member

njsmith commented Jul 2, 2018

That error message from trio's side is pretty bad too... in general we should probably tweak that TrioInternalError so it says that hazmat misuse can also cause this error. In this specific case I'm also confused about how you could have gone down that particular path... trio tries to provide good error messages if you await a foreign function in trio context. Why didn't it fire this time?

(Re-opening so we don't forget these questions, though @smurfix may already know the answer.)

@njsmith njsmith reopened this Jul 2, 2018
@njsmith njsmith changed the title TrioInternalError with trio_asyncio & distributed/tornado Poor error messages from trio after hitting an issue with trio_asyncio + distributed/tornado Jul 30, 2018
@njsmith
Copy link
Member

njsmith commented Jan 27, 2019

The link from #882 just reminded me to look at this again, and I figured it out. Here's a minimal reproducer:

import trio
import asyncio

async def f():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(trio.sleep, float("inf"))
        future = asyncio.Future()
        await future

trio.run(f)
  • The main task creates a nursery, and starts a background task running.

  • Then it does await future

  • Trio detects that it's awaited a future, and tries to cope with this by killing the task. There's a long comment about this:

    trio/trio/_core/_run.py

    Lines 1488 to 1503 in 360f1fb

    exc = TypeError(
    "trio.run received unrecognized yield message {!r}. "
    "Are you trying to use a library written for some "
    "other framework like asyncio? That won't work "
    "without some kind of compatibility shim.".format(msg)
    )
    # How can we resume this task? It's blocked in code we
    # don't control, waiting for some message that we know
    # nothing about. We *could* try using coro.throw(...) to
    # blast an exception in and hope that it propagates out,
    # but (a) that's complicated because we aren't set up to
    # resume a task via .throw(), and (b) even if we did,
    # there's no guarantee that the foreign code will respond
    # the way we're hoping. So instead we abandon this task
    # and propagate the exception into the task's spawner.
    runner.task_exited(task, Error(exc))

    As you can see, we actually zap the task (stop scheduling it, don't run any cleanup code, etc.) and hope for the best. This is very clever but...

  • In our example program and in @dhirschfeld's program, this happens in the main task, so the main task gets zapped.

  • Since the main task has exited, trio thinks that it's time for the whole program to exit... but we still have a task running (the trio.sleep in our example, trio-asyncio's background machinery in @dhirschfeld's example)

  • trio.run explodes

So I think we need to give up on the clever task-zapping trick. Other options:

  • Immediately blow up the whole program, instead of just the one task.
  • Dump a warning to the console, and then throw in an exception and hope for the best.

@njsmith njsmith changed the title Poor error messages from trio after hitting an issue with trio_asyncio + distributed/tornado Our code for recovering after a "foreign await" needs to be reworked Jan 27, 2019
@njsmith
Copy link
Member

njsmith commented Jan 27, 2019

thinking about this a bit more:

Blowing up the whole program is by far the easiest thing to do, but unfortunately it means users don't get any info at all about where they awaited the wrong thing, which is a pretty terrible UX :-(

We could like... recursively zap the whole task tree underneath the offending task... but this seems like a mess

So... I guess that means we need to immediately resume the task by .throwing in an exception? I think that actually will work on both asyncio and curio. I've just been super-reluctant to go down that road because it complexifies the rest of trio's run loop (.throw is buggy, so we never use it otherwise → resuming via .throw complexifies the whole task state tracking machinery). But I don't see any alternative.

oremanj added a commit to oremanj/trio that referenced this issue Aug 7, 2019
This improves debuggability (since the traceback points to the location of the error, rather than just the nursery whose task was at fault) and fixes python-trio#552 by ensuring we don't abandon the child tasks of the errant one.
oremanj added a commit to oremanj/trio that referenced this issue Aug 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants