Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No tests for notebooks, fixture does not work when store = True #274

Closed
krassowski opened this issue Apr 1, 2024 · 9 comments · Fixed by #329
Closed

No tests for notebooks, fixture does not work when store = True #274

krassowski opened this issue Apr 1, 2024 · 9 comments · Fixed by #329
Labels
bug Something isn't working

Comments

@krassowski
Copy link
Member

Description

Heads up to anyone trying to improve test coverage:

I tried to write tests for #270 using rtc_create_notebook and rtc_create_file fixtures. While I was successful with rtc_create_file, tests with rtc_create_notebook appear to stall if using store=True.

I also noticed that this fixture is not used, not even once in the tests. It looks like it needs some attention.

@pytest.fixture
def rtc_create_notebook(jp_root_dir, jp_serverapp, rtc_add_doc_to_store):
"""Creates a notebook in the test's home directory."""
fim = jp_serverapp.web_app.settings["file_id_manager"]
async def _inner(
path: str, content: str | None = None, index: bool = False, store: bool = False
) -> tuple[str, str]:
nbpath = jp_root_dir.joinpath(path)
# Check that the notebook has the correct file extension.
if nbpath.suffix != ".ipynb":
msg = "File extension for notebook must be .ipynb"
raise Exception(msg)
# If the notebook path has a parent directory, make sure it's created.
parent = nbpath.parent
parent.mkdir(parents=True, exist_ok=True)
# Create a notebook string and write to file.
if content is None:
nb = nbformat.v4.new_notebook()
content = nbformat.writes(nb, version=4)
nbpath.write_text(content)
if index:
fim.index(path)
if store:
await rtc_add_doc_to_store("json", "notebook", path)
return path, content
return _inner

It stalls on line 106:

await rtc_add_doc_to_store("json", "notebook", path)
@krassowski krassowski added the bug Something isn't working label Apr 1, 2024
@krassowski
Copy link
Member Author

It looks like the observe event/signal from YNotebook does not fire the callback:

def _on_document_change(target: str, e: Any) -> None:
if target == "source":
event.set()
async def _inner(format: str, type: str, path: str) -> None:
if type == "notebook":
doc = YNotebook()
else:
doc = YUnicode()
doc.observe(_on_document_change)
async with await rtc_connect_doc_client(format, type, path) as ws, WebsocketProvider(
doc.ydoc, ws
):
await event.wait()
await sleep(0.1)

Removing target check does not help - there is no callbacks at all.

YNotebook.observe is implemented in jupyter-ydoc, here:

def observe(self, callback: Callable[[str, Any], None]) -> None:
    """
    Subscribes to document changes.

    :param callback: Callback that will be called when the document changes.
    :type callback: Callable[[str, Any], None]
    """
    self.unobserve()
    self._subscriptions[self._ystate] = self._ystate.observe(partial(callback, "state"))
    self._subscriptions[self._ymeta] = self._ymeta.observe_deep(partial(callback, "meta"))
    self._subscriptions[self._ycells] = self._ycells.observe_deep(partial(callback, "cells"))

I am not familiar with the lower-level codebase enough to debug this further easily. Any thoughts?

@davidbrochart
Copy link
Collaborator

I didn't write these fixtures, so I'm also not very familiar with them.
Regarding YNotebook.observe, there has been changes recently in the way yrs handles them, and I updated pycrdt accordingly. Maybe worth trying again?
I'll try and check on a simple example and report back.

@davidbrochart
Copy link
Collaborator

This shows that observing notebook changes works as expected:

from jupyter_ydoc import YNotebook

ynb = YNotebook()

def callback(target, events):
    print(f"{target=}")
    for event in events:
        print(f"{str(event)=}")

ynb.observe(callback)
ynb.set(
    {
        "cells": [],
    }
)

# target='cells'
# str(event)='{target: [{"execution_count":null,"metadata":{"trusted":true},"source":"","id":"c6506805-eec8-496d-9ec1-28d7239bf0b8","cell_type":"code","outputs":[]}], delta: [{\'insert\': [<pycrdt._map.Map object at 0x7f9a0c5d84a0>]}], path: []}'
# target='meta'
# str(event)='{target: {"metadata":{"language_info":{"name":""},"kernelspec":{"name":"","display_name":""}},"nbformat":4.0,"nbformat_minor":5.0}, keys: {\'metadata\': {\'action\': \'add\', \'newValue\': <pycrdt._map.Map object at 0x7f9a0c5d92b0>}, \'nbformat_minor\': {\'action\': \'add\', \'newValue\': 5.0}, \'nbformat\': {\'action\': \'add\', \'newValue\': 4.0}}, path: []}'

@krassowski
Copy link
Member Author

I think the problem is not that observing does not work when a change is made via set, but that the test fixture assumes that an event is emitted to the observer after establishing an initial connection. I think there might be a difference in behaviour between files and notebooks here.

@krassowski
Copy link
Member Author

It looks like the problem is possibly not with the fixture, but with the logic itself. I see that the content is loaded for the notebook:

WARNING  ServerApp:manager.py:741 Notebook test.ipynb is not trusted
INFO     ServerApp:rooms.py:158 Content in room json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd loaded from file test.ipynb
DEBUG    ServerApp:yutils.py:111 Received SYNC_STEP1 message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:120 Sending SYNC_STEP2 message to endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:111 Received SYNC_STEP2 message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:yutils.py:111 Received SYNC_UPDATE message from endpoint: json:notebook:0d6434b8-9fe3-4c16-8d95-c82927bf05fd
DEBUG    ServerApp:fileio.py:267 Reading path from disk: test.ipynb
DEBUG    ServerApp:fileio.py:267 Reading path from disk: test.ipynb

In code it hits this path:

if read_from_source:
self._emit(LogLevel.INFO, "load", "Content loaded from disk.")
self.log.info(
"Content in room %s loaded from file %s",
self._room_id,
self._file.path,
)
self._document.source = model["content"]
if self.ystore:
await self.ystore.encode_state_as_update(self.ydoc)

I am a bit confused as to why encode_state_as_update is called and the update is not applied later.

I think this might be the error - the state is not set with .set() but by overriding self._document.source directly. This means that the observer callback is never fired.

@krassowski
Copy link
Member Author

It looks like this logic traces back to #2.

It looks like in ytstore implementation this method does apply the update:

https://github.com/jupyter-server/pycrdt-websocket/blob/20c2f388e27badc7eaf6369381be9bec6b0b5c28/pycrdt_websocket/ystore.py#L138-L145

Anyways, that looks like a wrong avenue to pursue. The bug is in the different between YNotebook and YUnicode...

@krassowski
Copy link
Member Author

krassowski commented Jul 18, 2024

I think I got it. There is no .emit() because pycrdt ignores empty updates:

https://github.com/jupyter-server/pycrdt/blob/32a73975edac0a571d7b63512d8e2df88720f136/python/pycrdt/_sync.py#L107-L114

We can trigger the same issue with a text file by passing an empty string (equal to default content):

async def test_get_document_file2(rtc_create_file, jp_serverapp):
    path, content = await rtc_create_file("test.txt", "", store=True)  # will timeout
    collaboration = jp_serverapp.web_app.settings["jupyter_server_ydoc"]
    document = await collaboration.get_document(
        path=path, content_type="file", file_format="text"
    )
    assert document.get() == content == ""
    await collaboration.stop_extension()

@krassowski
Copy link
Member Author

In addition, the target which fires for notebooks is not source but state, meta, cells.

@krassowski
Copy link
Member Author

Changing the fixture to:

@pytest.fixture
def rtc_add_doc_to_store(rtc_connect_doc_client):
    event = Event()

    async def _inner(format: str, type: str, path: str) -> None:

        def _on_document_change(target: str, e: Any) -> None:
            expected_target = "cells" if type == "notebook" else "source"
            if target == expected_target:
                event.set()

        if type == "notebook":
            doc = YNotebook()
        else:
            doc = YUnicode()

        doc.observe(_on_document_change)

        async with await rtc_connect_doc_client(
            format, type, path
        ) as ws, WebsocketProvider(doc.ydoc, ws):
            await event.wait()
            await sleep(0.1)

    return _inner

Appears sufficient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants