Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

output of type stream is split over multiple cells #278

Closed
cscheid opened this issue Mar 30, 2023 · 10 comments · Fixed by #279
Closed

output of type stream is split over multiple cells #278

cscheid opened this issue Mar 30, 2023 · 10 comments · Fixed by #279

Comments

@cscheid
Copy link

cscheid commented Mar 30, 2023

Hi! Over at quarto, we're seeing execution behavior that doesn't match that of jupyter lab or jupyter notebook for cells that produce streaming output. Specifically, the output is unpredictably split across multiple cells. This is a problem because multiple cells are expected to be in the output.

Consider this repro (I apologize for requiring the julia kernel, this is the one open-source kernel we've been able to reproduce the issue with)

nbclient_test.py

#!/usr/bin/env python3

import nbformat
import sys
from nbclient import NotebookClient
import atexit

notebook_filename = sys.argv[1]
nb = nbformat.read(notebook_filename, as_version=4)
client = NotebookClient(nb, timeout=600, kernel_name='julia-1.8')

client.create_kernel_manager()
client.start_new_kernel()
client.start_new_kernel_client()
atexit.register(client._cleanup_kernel)

for index, cell in enumerate(client.nb.cells):
  if cell.cell_type == 'code':
    cell = client.execute_cell(
        cell = cell,
        cell_index = index)
    client.nb.cells[index] = cell
nbformat.write(client.nb, sys.argv[2])

test.ipynb

{
  "cells": [
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "for x ∈ (42, 12.0, 3.3e4, \"Hallo!\", true, 1+2im)\n",
        "    println(\"x = \", x, \" ..... type: \", typeof(x))\n",
        "    flush(stdout)\n",
        "end"
      ],
      "execution_count": null,
      "outputs": []
    }
  ],
  "metadata": {
    "kernelspec": {
      "name": "julia-1.8",
      "language": "julia",
      "display_name": "Julia 1.8.5"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 4
}

steps to repro

$ python nbclient_test.py test.ipynb out.ipynb
$ cat out.ipynb
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "execution": {
     "iopub.execute_input": "2023-03-30T08:46:58.042000Z",
     "iopub.status.busy": "2023-03-30T08:46:57.788000Z",
     "iopub.status.idle": "2023-03-30T08:46:58.531000Z",
     "shell.execute_reply": "2023-03-30T08:46:58.508000Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = 42 ..... type: Int64"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = 12.0 ..... type: Float64\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = 33000.0 ..... type: Float64\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = Hallo! ..... type: String\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = true ..... type: Bool\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = 1"
     ]
    },
...

Note the output is broken across the files. For some kernels, the resulting output is mangled.

If this cell is executed in jupyter notebook or jupyter lab, we see a single output instead:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "x = 42 ..... type: Int64\n",
      "x = 12.0 ..... type: Float64\n",
      "x = 33000.0 ..... type: Float64\n",
      "x = Hallo! ..... type: String\n",
      "x = true ..... type: Bool\n",
      "x = 1 + 2im ..... type: Complex{Int64}\n"
     ]
    }
   ],
...

We were surprised by this behavior, and are wondering if this is expected output for you. Thank you!

@cscheid
Copy link
Author

cscheid commented Mar 30, 2023

I'm noticing now that this produces output that Visual Studio Code's notebook extension can't format well either:

image

Compare that to the execution inside jupyter itself:

image

@cscheid
Copy link
Author

cscheid commented Mar 30, 2023

A user reported that this seems to be caused by calls to flushing the stream, and so a python kernel with this should suffice to repro:

for i in range(1,6):
    print("hello world", flush=True)

@davidbrochart
Copy link
Member

Thanks for reporting the issue.
I cannot reproduce it with ipykernel. It seems to me that the kernels which show this behavior don't implement the kernel protocol well. Maybe they don't send the parent's msg_id in their IOPub messages, although this would likely create other issues.
Anyways, I will try and reproduce the issue with the IJulia kernel.

@davidbrochart
Copy link
Member

Ah my bad, I think it's just a question of "grouping" the outputs per type, I don't think nbclient does that indeed.
I can implement that.

@cscheid
Copy link
Author

cscheid commented Mar 30, 2023

Right. FWIW, I've just implemented a very similar workaround on quarto: https://github.com/quarto-dev/quarto-cli/pull/5042/files#diff-f14392ed1b322ded009b3110fd5501edca659f5d0215c3491dfb26711e699083

@chrisjsewell
Copy link
Contributor

@cscheid
Copy link
Author

cscheid commented Mar 30, 2023

@chrisjsewell Thanks for the added info. I understand it does not, and we now have a workaround on our side. But I'm wondering whether it should (specifically so the resulting .ipynb with multiple outputs render correctly. This comes up not only in quarto, but in jupyter lab, jupyter notebook and visual studio as well).

@davidbrochart
Copy link
Member

davidbrochart commented Mar 31, 2023

I opened #279 with @chrisjsewell's solution in MyST-NB. I just copy/pasted your code, please let me know if it's fine!
Note that if outputs are split over multiple cells, then that's another issue (not fixed in #279).

@chrisjsewell
Copy link
Contributor

I just copy/pasted your code, please let me know if it's fine!

oh yeh absolutely no problem, thanks for asking!

@davidbrochart
Copy link
Member

Thanks, I added you as a co-author.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants