Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Significantly reduce heartbeat overhead #301

Merged
merged 10 commits into from
Dec 14, 2022
Merged

Conversation

bdraco
Copy link
Member

@bdraco bdraco commented Dec 10, 2022

The heartbeat built into aiohttp tries very hard to deliver exact timeouts by canceling the timer handler every time a message is received. When the volume of messages is high (bluetooth), this has a significant overhead. Since we do not need that level of resolution with the heartbeat we can use a significantly more performant heartbeat algorithm.

Before this change resetting the heartbeat is ~38% the execution time to receive a websocket message.

Before:
rec_before

After:

The cancelation of the non-existent heart beat still takes ~4% of the run time but still much better

rec_after
Screenshot 2022-12-09 at 8 54 09 PM

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

I think this is good to go, but I need to spend a few more hours doing testing with it

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

Test Plan:

  • Plan 1
    With bluetooth scanner enabled
    Unpower device, verify it sees it as unavailable
    Power device, verify it becomes available
    Unpower device, verify it sees it as unavailable
    Power device, verify it becomes available

  • Plan 2
    With bluetooth scanner enabled
    Unpower device, do not wait for it to be seen as unavaialble
    Power device, verify it still works
    Unpower device, do not wait for it to be seen as unavaialble
    Power device, verify it still works

  • Plan 3
    With bluetooth scanner disabled
    Unpower device, verify it sees it as unavailable
    Power device, verify it becomes available
    Unpower device, verify it sees it as unavailable
    Power device, verify it becomes available

  • Plan 4
    With bluetooth scanner disabled
    Unpower device, do not wait for it to be seen as unavaialble
    Power device, verify it still works
    Unpower device, do not wait for it to be seen as unavaialble
    Power device, verify it still works

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

Found an unexpected issue unrelated to this PR. We never get UpdateType.INITIALIZED when the device reconnects so we never restart the ble scanner. I'll fix that in another PR

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

We could also trigger a refresh when we see the device via zeroconf to speed up reconnects if its offline for a while

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

So there is a race when we reconnect.

If its offline for longer than one cycle INITIALIZED never gets fired and we never reconnect the scanner

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

That problem is existing and not related to this PR. Digging into that though now

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

2022-12-12 10:40:28.786 WARNING (MainThread) [aioshelly.rpc_device.device] 192.168.106.39: Finished init, _update_listener=None initialized=True
2022-12-12 10:40:28.786 INFO (MainThread) [homeassistant.components.shelly] Fetching shellyplugus-c049ef8c1f84 data recovered

So the _update_listener is None when it recovers

@thecode
Copy link
Contributor

thecode commented Dec 12, 2022

@bdraco I was going to ask you about the UpdateType.INITIALIZED as we don't have coverage for this in Home Assistant Shelly

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

So when we get a connect error we call shutdown which sets the _update_listener to None

        except (*CONNECT_ERRORS, RpcCallError) as err:
            self._last_error = DeviceConnectionError(err)
            _LOGGER.debug("host %s: error: %r", ip, self._last_error)
            if not async_init:
                await self.shutdown()
                raise DeviceConnectionError(err) from err

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

So even worse the device just stops getting updates after it hits a connection error

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

Its not just the scanner. I stop getting state updates from the device completely but I can still control it.

@bdraco
Copy link
Member Author

bdraco commented Dec 12, 2022

I'm going to start another PR to fix that.
#307

@bdraco bdraco marked this pull request as ready for review December 12, 2022 21:01
@bdraco
Copy link
Member Author

bdraco commented Dec 14, 2022

Retest looks good with the other fix

@bdraco bdraco merged commit e3020ff into main Dec 14, 2022
@bdraco bdraco deleted the reduce_heatbeat_overhead branch December 14, 2022 00:44
bdraco added a commit that referenced this pull request Aug 8, 2024
bdraco added a commit that referenced this pull request Aug 8, 2024
bieniu pushed a commit that referenced this pull request Aug 11, 2024
* Revert "Significantly reduce heartbeat overhead (#301)"

This reverts commit e3020ff.

* fix merge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants