Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process exit handler for tracking unresolved asynchronous work ("promise deadlocks"/"why is node terminating for no reason?") #307

Closed
CMCDragonkai opened this issue Dec 13, 2021 · 3 comments · Fixed by #445
Assignees
Labels
enhancement New feature or request r&d:polykey:supporting activity Supporting core activity

Comments

@CMCDragonkai
Copy link
Member

Is your feature request related to a problem? Please describe.

Coming from #296 (comment).

Promises by themselves don't keep the process open.

This means if you create a promise that is intended to be resolved by an event emitter, or some other event. If that event is not emitted, or never emitted. Then the entire Node process will EXIT at that point, and the rest of the code after awaiting the promise will not be executed. This is known as "promise deadlock". nodejs/node#22088 (comment)

I encountered this while trying to create a FlowCountInterceptor that was not coded correctly and resulted in a promise that never resolves.

One way to fix this, is to use poll instead of awaiting for an event driven promise. It turns out looping setTimeout keeps the event loop alive whereas waiting for an internal event does not. In fact you can unref a timer: https://nodejs.org/api/timers.html. Waiting for IO events usually does block the event loop. Unfortunately the EventEmitter does not. There's some extra stuff about node timers and event loop internals that one has to learn here: https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/

Another thing to help us to is to use our ExitHandlers to attach handlers for "unresolved" promises, which can help us detect this kind sneaky bug in the future. This would have to use a combination of beforeExit and also the new async hooks module.

Quick hack to detect these deadlocks:

process.on('beforeExit', (code) => {
  if (process.exitCode == null) {
    console.log('EXIT WITH DEADLOCK', code);
  }
});

Note that if you were to put setTimeout to then emit the event, nodejs won't terminate until the setTimeout is resolved. This is because a timer event is put on the event loop. It's not that node knows that the promise will eventually resolve, it's simply that there's still work to do.

This only happens when you are awaiting on a promise or expecting a callback to be called, but no work is available on the event loop.

Describe the solution you'd like

Incorporate the beforeExit handler into the ExitHandlers and create a relevant exception that indicates when this occurs. Something like ErrorPolykeyAsynchronousDeadlock. This should tell us that somewhere we are waiting for a promise that never resolves, and Node decides to just exit there because there is no other work on the event loop.

We won't have a way to write into the exception where this promise being awaited for is coming from though. So using a tracing debugger would be necessary. But it's better than not having any message at all.

Also doing this will require mocking in pkStdio as well to prevent this handler from being attached to the jest process.

Describe alternatives you've considered

Consider using the async_hooks API that is available to track it.

Additional context

  1. Nodejs does not wait for promise resolution - exits instead nodejs/node#22088
  2. Promise.all() dies silently if no resolve nodejs/node#29355
  3. Process exits with unresolved promises nodejs/promises-debugging#16

Note that for the opposite problem: "Why is node still running?", there are these solutions:

@CMCDragonkai
Copy link
Member Author

@tegefaulkes debuggers don't work for async operations that don't have a hierarchical context.

It seems the same async hooks here that can be used to monitor asynchronous runtime in NodeJS should also help debug that. We may want to look into building in monitoring there.

@tegefaulkes
Copy link
Contributor

I've added this in #445 to partially address #414

@CMCDragonkai
Copy link
Member Author

The reason beforeExit works is because:

  1. Nothing is in event loop
  2. Not due to explicit exit
  3. Not due to explicit exception thrown, or uncaught exception or unhandled rejection (<- confirm please)
  4. And the exitCode shouldn't be set, cause we only set it when we are about to exit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request r&d:polykey:supporting activity Supporting core activity
Development

Successfully merging a pull request may close this issue.

2 participants