[Proposal] Jupyter Server should handle resolving kernel lifecycle and execution states. #990

Open
Zsailer opened this issue Sep 22, 2022 · 6 comments

Comments

@Zsailer
Member

Zsailer commented Sep 22, 2022

Frontend applications should be able to fetch the lifecycle states (starting, connecting, connected, terminating, dead, restarting, ...) and execution states (busy, idle, etc.) of a given kernel. The server should be in charge of resolving these states, since it talks directly to the kernel.

As mentioned in a previous issue, frontends today are forced to resolve kernel state by tracking IOPub status messages. Relying on the client to listen to the IOPub stream causes many problems; see that issue for more details.

I propose that we make Jupyter Server's kernel manager responsible for storing, tracking, and returning the state of its kernel. We should add a REST API to the kernels service to fetch the state of a kernel, e.g.

GET /api/kernels/{kernel-id}/state

RESPONSE:
{
  "execution_state": "idle",
  "lifecycle_state": "connected"
}
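A minimal sketch of what such a server-side tracker could look like. `KernelStateTracker`, `KernelState`, and their methods are hypothetical names used for illustration only, not existing Jupyter Server classes. The only protocol detail relied on is that IOPub `status` messages carry `content.execution_state`.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class KernelState:
    # Default values are assumptions for this sketch.
    execution_state: str = "unknown"
    lifecycle_state: str = "starting"


class KernelStateTracker:
    """Hypothetical tracker that the kernel manager could own."""

    def __init__(self) -> None:
        self._states: dict[str, KernelState] = {}

    def register(self, kernel_id: str) -> None:
        self._states[kernel_id] = KernelState()

    def set_lifecycle(self, kernel_id: str, lifecycle_state: str) -> None:
        self._states[kernel_id].lifecycle_state = lifecycle_state

    def on_iopub_message(self, kernel_id: str, msg: dict) -> None:
        # Only IOPub "status" messages update the execution state.
        if msg.get("header", {}).get("msg_type") == "status":
            self._states[kernel_id].execution_state = msg["content"]["execution_state"]

    def get_state_json(self, kernel_id: str) -> str:
        # Body a GET /api/kernels/{kernel-id}/state handler could return.
        return json.dumps(asdict(self._states[kernel_id]))
```

With this shape the server listens to IOPub once per kernel, and a REST handler only has to serialize the stored state.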

We can also consider leveraging the Jupyter Event system to emit notifications with the kernel state changes.
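To illustrate the push-based alternative, here is a small in-process sketch. `KernelStateNotifier` is a hypothetical stand-in for a real event system (it does not use the Jupyter Events API); it only shows the idea of notifying subscribers when a state actually changes.

```python
from typing import Callable

# (kernel_id, field, new_value); this signature is an assumption of the sketch
Listener = Callable[[str, str, str], None]


class KernelStateNotifier:
    """Hypothetical notifier that emits on state transitions only."""

    def __init__(self) -> None:
        self._listeners: list[Listener] = []
        self._last: dict[tuple[str, str], str] = {}

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def update(self, kernel_id: str, field: str, value: str) -> None:
        # Skip repeated identical values so subscribers see transitions only.
        if self._last.get((kernel_id, field)) == value:
            return
        self._last[(kernel_id, field)] = value
        for listener in self._listeners:
            listener(kernel_id, field, value)
```

A subscriber is notified as soon as the server observes a change, rather than waiting for the next poll of the REST endpoint.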

This proposal would address #989

@krassowski
Collaborator

It's probably out of scope for now, but I believe that it would be useful if in future kernels could provide progress updates (rather than simple idle/busy). The proposed response format seems easily extensible to include e.g. "execution_progress": 0.5 so +1.

I would say that latency matters for the user experience here, so if implementing this as an event avoids the delay of polling at longer intervals, it might be better (but I have no insight into whether that is the case).

@krassowski
Collaborator

Coming back here from jupyterlab/jupyterlab#16059. This proposal would solve jupyterlab/jupyterlab#16059 (the kernel status could be properly updated after refresh), but it does not solve the problem of status for individual cells. I am talking about the execution indicators like [*] - these would still not show up even if we implement this proposal.

In jupyterlab/jupyterlab#16059 I was thinking about storing the information about pending execution on the frontend. Because a pending execution request may have completed during a page reload, the frontend would need a way to ask the kernel (server?) about the status of that execution request (which can be identified by its message id). This could be solved by extending the proposed jupyter-server API with:

GET /api/kernels/{kernel-id}/{execution-request-id}/state

jupyter-server would then keep track of the kernel status updates in a dictionary keyed by execution request id, with boolean values (pending or completed). Thoughts?

(and in future those could be made floats to represent progress ;))
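A rough sketch of that dictionary, assuming the server sees each `execute_request` and the IOPub `status` messages that follow it. `ExecutionRequestTracker` is a hypothetical name. The sketch relies on the messaging protocol's convention that a `status` message's `parent_header.msg_id` identifies the request that caused it.

```python
from typing import Optional


class ExecutionRequestTracker:
    """Hypothetical per-kernel tracker keyed by execute_request msg_id."""

    def __init__(self) -> None:
        # msg_id -> True once the kernel reported "idle" for that request
        self._done: dict[str, bool] = {}

    def on_execute_request(self, msg_id: str) -> None:
        self._done[msg_id] = False

    def on_iopub_status(self, msg: dict) -> None:
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id in self._done and msg["content"]["execution_state"] == "idle":
            self._done[parent_id] = True

    def is_done(self, msg_id: str) -> Optional[bool]:
        # None means the server never saw this request.
        return self._done.get(msg_id)
```

A handler for the per-request endpoint could return `is_done(...)` directly. Replacing the boolean with a float later would not change the lookup path.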

@davidbrochart
Contributor

Going a step further, I think the long-term solution is having server-side execution, and not having frontends deal with the kernel protocol in the first place.

@krassowski
Collaborator

Server-side execution as implemented in jupyterlab/jupyterlab#15448 does not solve jupyterlab/jupyterlab#16059 itself. I think these two solutions are working in lockstep.

@davidbrochart
Contributor

Not yet, but for instance jupyter-server/jupyter_ydoc#197 goes in that direction. The cell execution state could be included in the notebook shared model, and the kernel execution state could be inferred from all cells' execution state.
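As a sketch of that inference, assuming each cell in the shared model exposes a per-cell state string with the values "busy" and "idle" (an assumption of this example, not something the jupyter_ydoc schema is known to define):

```python
def infer_kernel_execution_state(cell_states: list[str]) -> str:
    """Return "busy" if any cell is still executing, otherwise "idle"."""
    return "busy" if any(state == "busy" for state in cell_states) else "idle"
```

This only covers executions that originate from cells in the shared model; requests sent by other clients of the same kernel would not be reflected.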

@krassowski
Collaborator

I think this is a reasonable idea to store these in the shared model.

In any case, I think we need more than execution_state: str. While it is nice that upon refresh cells would show up as busy/pending execution, when you get a new status from the kernel saying idle, how would you know which cell it was for?

What I am saying is that maybe we need to store a list of objects mapping to each of the pending execution requests. Something like:

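from dataclasses import dataclass, field
from typing import Literal

@dataclass
class Request:
    id: str  # message id of the pending request
    type: Literal['stdin_request', 'execute_request']

@dataclass
class Cell:
    pending_requests: list[Request] = field(default_factory=list)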

Now, execute_request and stdin_request are quite different: the former is sent from client to kernel, the latter from kernel to client, so I am not sure we want to keep them together. But stdin_request is equally important, because without a way of restoring the input box the kernel ends up in a deadlock after refresh.
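One way such a mapping could answer the "which cell was it for?" question is sketched below. `cell_id` and `resolve_request` are hypothetical additions for illustration; the lookup assumes the incoming message's parent `msg_id` matches a stored request id.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional


@dataclass
class Request:
    id: str
    type: Literal["stdin_request", "execute_request"]


@dataclass
class Cell:
    cell_id: str  # hypothetical identifier, not part of the sketch above
    pending_requests: list[Request] = field(default_factory=list)


def resolve_request(cells: list[Cell], parent_msg_id: str) -> Optional[str]:
    """Remove the request matching parent_msg_id and return the owning cell's id."""
    for cell in cells:
        for req in cell.pending_requests:
            if req.id == parent_msg_id:
                cell.pending_requests.remove(req)
                return cell.cell_id
    return None  # no cell was waiting on this request
```

After a refresh, a frontend could rebuild the `[*]` indicators and any open input boxes from whatever is left in `pending_requests`.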
