bug: RPC WebSocket disconnects if protocol is > ~1 MB #6159

Closed
nickcrider opened this issue Jul 16, 2020 · 3 comments · Fixed by #6229
Labels
bug robot server Affects the `robot-server` project robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience). rpc This involves Opentrons' deprecated RPC system.

Comments

@nickcrider
Contributor

nickcrider commented Jul 16, 2020

Overview

A seemingly valid PD protocol causes the app to throw an ACK Timeout error when uploaded to the robot. No errors can be observed in the robot logs.

Update:
This seems to be caused by the robot server ending the real-time connection if any single message is larger than 1 MB. In this case, the protocol itself is larger than 1 MB, so the upload command causes the connection to end unexpectedly.

Current behavior / Steps to reproduce

  1. Unzip ACK_Timeout_PD.zip and upload the protocol to a robot running 3.19.0. My robot is on Wi-Fi; I have not tested over USB.
  2. You'll get this error after a few seconds:
    image
  3. No errors appear in the output of journalctl -f.
  4. The JSON protocol simulates successfully with opentrons_simulate.
@nickcrider nickcrider added bug protocol designer Affects the `protocol-designer` project robot server Affects the `robot-server` project labels Jul 16, 2020
@SyntaxColoring
Contributor

I could also reproduce this running the robot server locally with make dev.

@mcous mcous added robot-svcs Falls under the purview of the Robot Services squad (formerly CPX, Core Platform Experience). rpc This involves Opentrons' deprecated RPC system. labels Jul 28, 2020
@mcous
Contributor

mcous commented Jul 28, 2020

Upon investigation, there are two problems:

  • The app, for some reason, will only show a "lost connection" alert if the health status of the robot in question is not ok
    • So, if the robot is responding on its health endpoint and the websocket goes down, the websocket failure will not be surfaced to the user
    • This is a definite app side bug
  • The protocol, reproducibly, causes the websocket to disconnect
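
The first problem can be modeled as a small decision function. This is only a hypothetical sketch of the behavior described above, not the app's actual code (the app is not written in Python); the function and parameter names are invented for illustration, and the "expected" variant is an assumption about what a fix might check.

```python
def observed_alert(health_ok: bool, websocket_open: bool) -> bool:
    """Behavior described in this issue (simplified assumption):
    the "lost connection" alert depends only on the health status."""
    return not health_ok


def expected_alert(health_ok: bool, websocket_open: bool) -> bool:
    """Hypothetical fix: a dropped WebSocket also counts as a lost connection."""
    return (not health_ok) or (not websocket_open)


if __name__ == "__main__":
    # The case from this issue: health endpoint responds, WebSocket is down.
    print(observed_alert(health_ok=True, websocket_open=False))
    print(expected_alert(health_ok=True, websocket_open=False))
```

With a healthy robot and a closed socket, the observed logic stays silent while the hypothetical fix raises the alert.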

So the summary of events is:

  • App sends RPC call to create a protocol with the contents of this JSON file
  • WebSocket goes down pretty much immediately
  • App misses the fact that the WS is down, hits its 10 second command acknowledge timeout, and surfaces that error
  • Robot is disconnected but user was never alerted and app still shows robot as connected
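
The sequence above can be sketched with asyncio as a race between an acknowledgement, a socket close, and a timeout. This is a minimal illustration under stated assumptions, not the app's implementation: `wait_for_ack`, `ack`, and `closed` are hypothetical names, and the demo uses a short timeout in place of the 10 second one mentioned above.

```python
import asyncio


async def wait_for_ack(ack: asyncio.Future, closed: asyncio.Event, timeout: float) -> str:
    """Wait for an ack, but report a socket close as soon as it happens."""
    closed_task = asyncio.ensure_future(closed.wait())
    done, _pending = await asyncio.wait(
        {ack, closed_task}, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
    )
    closed_task.cancel()  # no-op if the task already finished
    if ack in done:
        return "acknowledged"
    if closed_task in done:
        return "disconnected"
    return "ack timeout"


async def demo() -> str:
    loop = asyncio.get_running_loop()
    ack = loop.create_future()          # never resolved: the server never acks
    closed = asyncio.Event()
    loop.call_later(0.01, closed.set)   # the socket goes down almost immediately
    return await wait_for_ack(ack, closed, timeout=0.5)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

If the close event is never observed, as in the bug described here, the same call falls through to the "ack timeout" branch once the timeout expires.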

Eventually, these logs show up in the robot server:

2020-07-28 14:16:02,374 robot_server.service.legacy.rpc.rpc ERROR [Line 161] While reading from socket:
Traceback (most recent call last):
  File "./robot_server/service/legacy/rpc/rpc.py", line 156, in handle_new_connection
    msg = await socket.receive_json()
  File "/Users/mc/.local/share/virtualenvs/robot-server-tntD7pR_/lib/python3.7/site-packages/starlette/websockets.py", line 98, in receive_json
    self._raise_on_disconnect(message)
  File "/Users/mc/.local/share/virtualenvs/robot-server-tntD7pR_/lib/python3.7/site-packages/starlette/websockets.py", line 80, in _raise_on_disconnect
    raise WebSocketDisconnect(message["code"])
starlette.websockets.WebSocketDisconnect: 1006

A 1006 is a generic error code that means the connection was closed abnormally, i.e. without a close frame being received.
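
For reference, a few of the close codes defined in RFC 6455 can be put in a small lookup table. This is an illustrative helper only; `CLOSE_CODES` and `describe_close_code` are hypothetical names and are not part of starlette or the robot server.

```python
# A subset of the WebSocket close codes defined in RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1003: "unsupported data",
    1006: "abnormal closure (no close frame received)",
    1009: "message too big",
    1011: "internal error",
}


def describe_close_code(code: int) -> str:
    """Return a short description for a WebSocket close code."""
    return CLOSE_CODES.get(code, "unknown or application-defined code")


if __name__ == "__main__":
    print(describe_close_code(1006))
```

Code 1006 is reserved and never sent on the wire; an endpoint reports it locally when the connection drops without a close frame, so by itself it does not say why the socket closed.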

The app's dev tools report the upload command message to be 1171956 bytes long (~1.2 MB).
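
A quick way to compare a message against the suspected limit is sketched below. The exact threshold is not stated in this issue; 1 MiB (1048576 bytes) is an assumption, and the helper names are hypothetical.

```python
import json

ONE_MIB = 1024 * 1024  # 1048576 bytes; assumed threshold, not confirmed here


def message_size_bytes(message: dict) -> int:
    """Size of a message once serialized to JSON and encoded as UTF-8."""
    return len(json.dumps(message).encode("utf-8"))


def exceeds_limit(size_bytes: int, limit: int = ONE_MIB) -> bool:
    """True if a message of this size is over the assumed limit."""
    return size_bytes > limit


if __name__ == "__main__":
    reported = 1171956  # size reported by the app's dev tools
    print(exceeds_limit(reported), reported - ONE_MIB)
```

The reported message is over the limit under either reading of "1 MB": it exceeds both 1000000 bytes and 1048576 bytes.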

@mcous mcous changed the title bug: ACK timeout is raised in the app for a PD protocol bug: RPC WebSocket disconnects if protocol is > ~1 MB Jul 28, 2020
@mcous
Contributor

mcous commented Jul 28, 2020

We seem to have a problem with our WebSocket server. If any given message is > 1 MB, the socket is closed.

This ticket will track this bug. I will file a new ticket for the app missing WebSocket disconnects when the robot is otherwise healthy.
