Add per-proxy distance sensors for configured beacons #99

Closed
agittins opened this issue Feb 3, 2024 · 17 comments · Fixed by #107

@agittins
Owner

agittins commented Feb 3, 2024

(from a discussion I was having with myself while writing a comment)

Create per-proxy distance sensors for each beacon so that I can easily use the history tool to visualise what's happening with the data, including missing packets. The current distance and area sensors are good, but their data makes it clear that:

  • missing packets are causing spurious area switches
  • noisy data makes it hard to choose between nearby areas that are open plan (line-of-sight)
  • having multiple proxies in one area really helps with both issues (graph showing (effectively) area-of-nearest-proxy and distance-to-nearest-proxy over time for my watch):
    image

Currently the distance sensor is showing the distance from the currently "winning" proxy, which makes analysis difficult. By adding a sensor per-proxy I'll be able to see the fuller picture (with the aid of HA's wonderful history graphing) and get visual feedback on improvements with distance filtering. While the raw data is available from the service it's more difficult to visualise.

@agittins agittins added the enhancement New feature or request label Feb 3, 2024
@agittins agittins self-assigned this Feb 3, 2024
@jaymunro
Contributor

Working well.
Screenshot 2024-02-28 at 6 06 25 PM
Screenshot 2024-02-28 at 6 06 15 PM

@jaymunro
Contributor

However, and you're probably aware already, the sensors seem to have stopped updating since downloading main.
In the graph below I did a restart about 5:37pm to upgrade HA to 2024.2.4 and another restart about 5:57pm to upgrade bermuda from 0.4.3 to main. There have been no updates on any of the distance sensors since the core restart.

Screenshot 2024-02-28 at 6 25 34 PM

A reload of the integration had no effect on the values.
A restart of HA core however kicked them into gear, and they are now updating.

Screenshot 2024-02-28 at 6 38 01 PM

Mentioning this in case someone else experiences it, or it starts reoccurring.

Let me know if there's anything I can do to help with testing.

@jaymunro
Contributor

This is actually quite useful. With one of the cars, I've always been seeing a lot of unknowns when it's just sitting there and I assumed missed packets, however the distance sensor to that proxy is showing a distance during an 'unknown' period (typically a few seconds). This suggests that maybe the packets are not being dropped, but that the logic is going above a tolerance level, somewhere between 2-3 metres I'm guessing.
Screenshot 2024-02-28 at 6 43 25 PM
Screenshot 2024-02-28 at 6 42 36 PM

The other car does not show this behaviour, but I believe it has a stronger transmission signal as it varies between only 0.5-0.8m despite being the same physical distance from the proxy.

@agittins
Owner Author

That's awesome - you've got more proxies than me, too! :)

Take a look in the configure dialog, my guess is you might have the max radius set to 3. I have mine set to 30 (and 20 is the new default in future releases). I suspect that's why it's going unknown.

Bear in mind that having it high also means that the area sensor won't be a reliable indicator of not home as the area will stay locked to the last area it was in, so best to use the device_tracker entity for that.

Perhaps I should add another option for how long to wait before declaring an area device unknown, I could see benefits doing it either way.

That's interesting re the cessation of updates after restart, I thought I had fixed that :-/ I'll be making changes in that area before I'm ready to release so hopefully it will be addressed there. Did you notice any warnings or errors in your logs at that time?

What's really cool is going to the history page and adding an entire device. So... much... data! 😻

Hopefully that will help me come up with much better algorithms for tracking distance and area - it's already shown how obviously bad the decisions are.

@agittins agittins linked a pull request Feb 28, 2024 that will close this issue
@jaymunro
Contributor

jaymunro commented Feb 29, 2024

You're right. I had forgotten about those settings. Max radius now set to 4 and much better.

I have found the new code restores updates on the area and distance sensors when the device returns home, even after a HA restart. However if there is a restart, the device_tracker also goes unavailable but does not recover when the device returns home. A Bermuda reload is required while the device is home.

Might I suggest to help prevent devices flipping between proxies while it is stationary, an outlier plus low_pass filter seems to work wonders in the graph. If this was added to the distance logic before the area is decided, it may help.

Here are 4 proxy distances on one of the car's bluetooth beacons.
Screenshot 2024-02-29 at 2 07 40 PM

And here are the distances cleaned up with
[filter: outlier, window: 2, radius: 5]
[filter: lowpass time_constant:2, precision: 2]
Screenshot 2024-02-29 at 2 08 25 PM

Here they are overlaid to show the latency is still low.
Screenshot 2024-02-29 at 2 07 06 PM
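For anyone wanting to try the same clean-up outside the graphing tool, here is a minimal Python sketch of an outlier filter followed by a low-pass filter, using the parameters quoted above (window 2, radius 5, time constant 2, precision 2). The class names and the exact smoothing formula are assumptions made for illustration, not code from Bermuda or Home Assistant.

```python
from collections import deque
from statistics import median


class OutlierFilter:
    """Replace a reading with the window median when it strays too far from it."""

    def __init__(self, window=2, radius=5.0):
        self.radius = radius
        self.history = deque(maxlen=window)  # last `window` filtered readings

    def update(self, value):
        # Only judge a reading once the window is full.
        if len(self.history) == self.history.maxlen:
            mid = median(self.history)
            if abs(value - mid) > self.radius:
                value = mid  # treat the reading as an outlier
        self.history.append(value)
        return value


class LowPassFilter:
    """First-order exponential smoothing; a larger time_constant is smoother."""

    def __init__(self, time_constant=2.0, precision=2):
        self.time_constant = time_constant
        self.precision = precision
        self.state = None

    def update(self, value):
        if self.state is None:
            self.state = value
        else:
            weight = 1.0 / self.time_constant
            self.state = (1.0 - weight) * self.state + weight * value
        return round(self.state, self.precision)


def clean(readings):
    """Run readings (metres) through the outlier filter, then the low-pass."""
    outlier, lowpass = OutlierFilter(), LowPassFilter()
    return [lowpass.update(outlier.update(r)) for r in readings]
```

With readings such as `[1.0, 1.1, 20.0, 1.2]`, the 20 m spike is replaced by the median of the previous two readings before it reaches the smoothing stage.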

Notes:

  • Garage door open/closed has a big effect on the gate proxy with the car in the garage.
  • While in the garage, Max often jumps to the back garden (purple in area sensor). Perhaps these filters would reduce or eliminate that.
  • The gate_opener and gate_light proxies are similar device type (Shelly 1PMmini, Plus 1) and in the same watertight box at the gate. Their orientation is different so the signal polarisation may also affect their reception from the beacon.
  • The change in distance from 2:00 to 2:03 is me driving the car to the gate, waiting 20 secs and driving back into the garage. The cleaned up sensors are much easier to interpret with clear swap over vs. the raw distances jumping back and forth halfway between.
  • I don't have other devices with static MAC or UUID for testing, only iPhones with dynamic MAC. I'm looking into an iOS app that may do what's needed but there is a limitation that the app must be in the foreground and phone unlocked, hence a sparsity of apps and why HA Companion app doesn't broadcast a UUID. https://stackoverflow.com/questions/29418388/ble-advertising-of-uuid-from-background-ios-app
  • The Garage door proxy seemed to have gotten stuck on the return back to the garage around the 2:02 mark, but this was not repeatable.
  • The physical distance from garage to gate is about 30m so the gate_light is way closer than the gate_opener.
Screenshot 2024-02-29 at 2 49 42 PM

@agittins
Owner Author

That's great!

You're definitely showing much-improved data from the filtering you've applied.

I agree that if we used the filtered data to make the area decision we'd get far fewer spurious switches between areas. Partly the issue is that currently I look at the recent adverts, and if the current area hasn't seen a packet for a while (currently ADVERT_FRESHTIME = 2.5 seconds) we ignore that reading. The filter would have us continue using that reading regardless of age, which is not entirely a bad thing, but risks a device getting "stuck" in an area if the receiver never received a more distant advert, so I think we will always need to "un-weight" distances if they haven't had recent updates. Looks like for the majority of cases though it will work just fine.
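The freshness rule described above can be sketched roughly as follows. Only `ADVERT_FRESHTIME = 2.5` comes from the comment; the function name and the `(area, distance, stamp)` tuple shape are hypothetical, chosen just to show how a stale reading drops out of the area decision.

```python
ADVERT_FRESHTIME = 2.5  # seconds, the value quoted in the comment above


def nearest_fresh_area(adverts, now, freshtime=ADVERT_FRESHTIME):
    """Return the area of the closest advert that is still fresh, else None.

    adverts: iterable of (area, distance_m, stamp) tuples (hypothetical shape).
    now and stamp are timestamps in seconds on the same clock.
    """
    fresh = [
        (distance, area)
        for area, distance, stamp in adverts
        if now - stamp <= freshtime
    ]
    if not fresh:
        return None  # nothing recent enough to trust
    return min(fresh)[1]  # smallest distance wins
```

A closer but stale reading loses to a more distant fresh one, which is what stops a device getting "stuck", at the cost of dropping to unknown when every reading is stale.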

The lag is pretty good, as you note - I expected it to be worse. Due to the asymmetry of the noise, and the nature of when we want low-latency (typically, we want entering an area to be fast, but leaving it can be slow) perhaps we can influence how the filtering treats decreasing values (ie, follow them fast) versus increasing (follow them slow).

> The gate_opener and gate_light proxies are similar device type (Shelly 1PMmini, Plus 1)
> Their orientation is different

Yes, it could be polarisation, but it could also be the proximity to anything else in/near the enclosure - with a 1/4 wavelength of 3.1cm the amount of variation caused by seemingly small differences can be insane! It does look like it was relatively consistent though - I wonder if that is dependent on the direction of the signal or if it's inherent to the receiver? Ie, if Max were parked a few metres down the street would the gate light still read lower or would the relationship change? I think there might be a lot of variance in receiver sensitivity (which we can calibrate out) but so much of it is also environmental. Like garage doors, as you say!
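As a quick check of the quarter-wavelength figure, here is the arithmetic in Python. The only assumption is the frequency range: BLE operates between roughly 2.402 and 2.480 GHz.

```python
SPEED_OF_LIGHT = 299_792_458  # metres per second


def quarter_wavelength_cm(freq_hz):
    """Quarter of the free-space wavelength, in centimetres."""
    wavelength_m = SPEED_OF_LIGHT / freq_hz
    return wavelength_m / 4 * 100


low_edge = quarter_wavelength_cm(2.402e9)   # bottom of the BLE band
high_edge = quarter_wavelength_cm(2.480e9)  # top of the BLE band
```

Both ends of the band come out at a little over 3 cm, so objects a few centimetres from the antenna are already a significant fraction of a wavelength away.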

> iPhone

Oh that's a super-interesting link! I'm tempted to dig deeper into the background-packet they identified, although the answerer's blog page indicates that iOS 12 changed to using scan-response packets for the same info, which makes it less useful I think.

Have you got Private BLE Device set up to resolve the IRK on your iPhone? I'm keen to integrate with it but it'll be a bit of a hassle for me to set up a testing environment here.

@jaymunro
Contributor

jaymunro commented Mar 7, 2024

Sorry, been tied up with another project (esphome/firmware#173) and an eye cataract operation which has delayed things a bit.

Interesting point on the asymmetry of RSSI. I can try for a filter that prioritises higher signal strengths.

> Have you got Private BLE Device set up to resolve the IRK on your iPhone? I'm keen to integrate with it but it'll be a bit of a hassle for me to set up a testing environment here.

Yep, I've been using Private BLE for many months on 5 phones to verify if anyone is home or not and to lock doors and such when the last person leaves. It's real easy to set up, just needs someone to accept a pairing from my Mac laptop once and then remove it. The IRK remains in the Mac's keyring for searching at leisure. Happy to do some testing when you're ready.

@agittins
Owner Author

agittins commented Mar 8, 2024

Oh dang, hope all goes well with the op recovery!

That voice work looks really cool, I don't have any hardware yet (just as well, plenty of stuff to play with already!) but I hope to give the voice stuff a crack when I have the time.

I'm spinning off a new ticket for the issue you noticed re device trackers after a restart that stay unavailable until a reload.

@jaymunro
Contributor

jaymunro commented Mar 9, 2024

@agittins Following a comment you made about a filter to take into account the asymmetry of RSSI signal noise and that it is okay to have some latency when leaving an area but have a very quick response when approaching an area, I have been testing a simple MIN filter which just uses the minimum value over the last x samples for each update.

Here are two bluetooth beacons on the cars, Max & Suma:
Screenshot 2024-03-10 at 9 42 31 AM

The receiver (proxy) is the same Shelly for both. Physically, Suma is 2m from the Shelly and Max is 0.8m. Max is a Tesla with built-in Bluetooth and antenna. Suma is a MikroTik iBeacon puck taped above the passenger's sun visor.

We can see that the Suma beacon is a lot noisier and has a larger variance (1.2m-4.5m) than Max (0.5m-1.3m). Suma has outliers up to about 8m and Max up to about 29m (I have seen one or the other go up to about 60m sometimes).

Here are the signals cleaned up with the MIN filter. Suma is using a sample size of 19 and Max is using 14 (I chose these as the smallest sample_size that would remove 99% of the noise, but lower values may be acceptable). There is a difference in the duration of the 'noise' between the transmitters, hence the different sample sizes needed.
Screenshot 2024-03-10 at 9 42 51 AM

The original and filtered overlaid to see latency:
Screenshot 2024-03-10 at 10 22 46 AM

When Suma was leaving this morning we can see a 16s lag in the filtered trace (makes sense, 19 samples at about 0.9 / sec) but, although hard to see, the trace drops back to the original with ZERO latency if there is a drop in the distance reading.
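A rolling MIN filter of the kind described here is only a few lines of Python. This is a minimal sketch; the class name is made up for illustration and `sample_size` corresponds to the 19 and 14 sample windows mentioned above.

```python
from collections import deque


class MinFilter:
    """Report the smallest distance seen in the last `sample_size` readings."""

    def __init__(self, sample_size):
        self.window = deque(maxlen=sample_size)

    def update(self, distance):
        self.window.append(distance)  # the oldest reading falls off automatically
        return min(self.window)


# Small demonstration with a window of 3 samples.
demo = MinFilter(sample_size=3)
filtered = [demo.update(d) for d in [2.0, 8.0, 2.1, 5.0, 5.0, 5.0, 1.0]]
```

In the demonstration the 8.0 spike never reaches the output, the rise to 5.0 only shows once the window holds nothing smaller, and the final drop to 1.0 appears on the same sample. That matches the behaviour in the graphs: lag when leaving, none when approaching.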

On a side note, a heads up that due to the large number of these distance sensors, each with many attributes, together with the high rate of change, my DB size started going exponential. Fixed in configuration.yaml with:

recorder:
  exclude:
    entity_globs:
      - sensor.*_distance_to_aska*
      - sensor.*_distance_to_back*
      - sensor.*_distance_to_bathroom*
      - sensor.*_distance_to_dining*
      - sensor.*_distance_to_gate*
      - sensor.*_distance_to_hallway*
      - sensor.*_distance_to_kitchen*
      - sensor.*_distance_to_living*
      - sensor.*_distance_to_master*
      - sensor.*_distance_to_office*
      - sensor.*_distance_to_spare*
      - sensor.*_distance_to_zoe*

I didn't exclude the garage sensors as I'm using them for the tests.

You may have fewer proxies, but worth checking DB size just in case if you haven't already. Aim to keep at least 1.5x DB size as free space, otherwise backup restores can fail and brick your HA (happened to me over the new year just gone).

EDIT I wanted to note that I am still using a max radius of only 4m otherwise the cars start jumping around all over the house and that's not good for the furniture.

@agittins
Owner Author

Oooh, great work with the sensor filtering!

Yes, I knew db size was likely to be pumping along :-) I use timescaledb for mine so it's free to chew up a TB or so if it needs it 🤣

I've got changes nearly ready to go which reduce the number of sensors that are enabled by default - so the Area and Distance sensors are the only ones that go active when the device is added, and all the others are added as "disabled", so one can just enable the sensors one wants for testing/debugging without churning the database and the state engine (and the browser's memory!).

One of those sensors is for raw rssi distance measurements, because...

Inspired by your results I've been experimenting with coding in some filters and I think I've got the beginnings of something:
image

It's probably a bit shoddy mathematically, but basically I am "sampling" the latest reading at each update_interval (so 0.9 seconds or whatever) and doing a moving window average, but I corrupt it by instantly adopting any closer reading we get, and when doing the averaging basically ignoring/plastering over any reading above a "local minima".

I think I'll make the sample count configurable, as it basically defines the slope angle for rising distance. In this the left half is at 10 samples, the right half is 20, and you can see the slope angle is reduced. In effect it's probably just weighting the minimum value more, since I effectively overwrite the history with the lowest point when doing the averaging.
image
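One way to read the description above is sketched below in Python. This is an interpretation, not the code that went into Bermuda: the class name, the newest-first history and the exact way larger readings are "plastered over" are all assumptions. Closer readings are adopted instantly, and when averaging, any older reading above the running minimum is replaced by that minimum.

```python
from collections import deque


class MinBiasedAverage:
    """Moving-window average that snaps to closer readings and rises slowly."""

    def __init__(self, samples=10):
        self.history = deque(maxlen=samples)  # newest reading first
        self.value = None

    def update(self, distance):
        self.history.appendleft(distance)
        # Adopt any closer reading straight away.
        if self.value is None or distance < self.value:
            self.value = distance
            return self.value
        # Walk from newest to oldest, flattening readings above the running minimum.
        floor = self.history[0]
        total = 0.0
        for reading in self.history:
            floor = min(floor, reading)
            total += floor
        average = total / len(self.history)
        # The average is only useful while it is below the actual reading.
        self.value = min(average, distance)
        return self.value
```

With `samples=4` and a device that jumps from 1 m to a steady 5 m, the output climbs over the next few updates instead of jumping, and a larger `samples` value makes that climb shallower, which fits the slope difference between the two halves of the graph.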

I'll be interested to see how it performs on your setup, whether the smoothing is good or if it might need a lot more tuning. I haven't tried it on my production system yet to make sure the way I do the sensors is resource-friendly, but I'm keen to do so ASAP.

There's a significant batch of changes.

Good call re the recorder filtering, too - I'm making a note to copy your snippet into the docs.

Also a very good point re the furniture! 🤣

@jaymunro
Contributor

I added it to dev and all looked good so also on production now.

Screenshot here shows before and after the update. The furniture now looks safe! I may increase the radius setting to a higher value. I have it at 6m now but may go higher if I dare, so as to get rid of some of the "unknown"s in the area sensor.
Screenshot 2024-03-16 at 5 32 50 PM

The filter looks good so far. Here's a close up of one car (Suma) returning home, stopping near the living room to let some people out and then driving into the garage.
Screenshot 2024-03-16 at 5 59 42 PM

@agittins
Owner Author

Awesome, thanks for trying it out - looks promising.

You might have to swap your living room windows with your garage door for better RF consistency. I'm sure the household will understand.

I think you're right about database size. My 108GB postgres database isn't too worried but people might start sending me invoices for replacing their worn out SD cards :-/

I think having the 1-second updates to the backend data works well because it's cheap (updates only take 0.009 seconds on my production), it makes the math easier for me and allows us to respond quickly for area changes etc, but the sensor updates on the front-end (and db) are too frequent for everyday use.

I think what I'll do is throttle the sensor outputs a bit. They will instantly change to lower values, but will hold off updating for higher distances until a percentage threshold or timeout (eg if 5 seconds elapses, or if the value increases by more than 100%, perhaps). That might give the best of both worlds, with the smoothing going on in the background, and sensors snapping to area approaches nicely, but relaxing about retreats (which if they are at the same time as an approach to a different area should still work out well).

I'll also round the sensor distance values to reduce the "churn" a bit further.
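The throttling idea can be sketched in Python as below. The 5 second timeout and the "more than 100%" threshold are the example figures from the comment; the class name, the `precision` default and the choice to always publish once the timeout has passed are assumptions for illustration.

```python
class ThrottledSensor:
    """Publish drops at once; hold back rises until a big jump or a timeout."""

    def __init__(self, timeout=5.0, rise_ratio=2.0, precision=1):
        self.timeout = timeout        # seconds between forced updates
        self.rise_ratio = rise_ratio  # 2.0 means "more than 100% larger"
        self.precision = precision    # decimal places kept in the sensor state
        self.published = None
        self.last_publish = None

    def update(self, distance, now):
        """Return the value to publish, or None to skip this update."""
        value = round(distance, self.precision)
        if (
            self.published is None                         # first reading
            or value < self.published                      # approach: publish now
            or value > self.published * self.rise_ratio    # large retreat
            or now - self.last_publish >= self.timeout     # waited long enough
        ):
            self.published = value
            self.last_publish = now
            return value
        return None
```

The backend can keep calling `update()` every second; only the calls that return a value need to touch the state machine and the recorder.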

I think I'll apply it to all the distance sensors, but I'll repurpose the config for update-interval for the sensor update timeouts, that way people can tweak the sensor min-update-rate without affecting the backend algo, and can also easily ramp it up to get instant updates while doing testing etc.

Keen to get the next update wrapped and out! Just want to make sure it won't blow out anyone's config too much first :-) Then on to iphone support...

@jaymunro
Contributor

I'm still noticing a fair few spikes scratching the furniture. Both Max and Suma are guilty here.
Screenshot 2024-03-16 at 11 48 07 PM

These are the filtered values of course and the unfiltered ones are way out there as you can see here. In the range of 18-35m
Screenshot 2024-03-16 at 11 51 47 PM

I was giving this a bit of thought and realised that most people and cars don't go faster than about 10km/h around the home which is about 3m/s. So short of someone propelling their phone out the window at high velocity, I think a primary outlier filter (i.e. before the avg/min combo) that just ignores impossible values may help.

Maybe something like any value that would be >3m in the last second, > 6m in the last 2 seconds, > 9m in the last 3 sec, etc. Most spikes are gone in about 1-2 secs with some occasionally lasting 3-5 seconds.

So an outlier filter based on a velocity of greater than 3m/s since the last reading that either A. just resets that value to the same as the previous value, or B. trashes the reading. The main idea is to treat those readings as erroneous and not worthy of the airwaves they are pretending to propagate through.

Do you think this may work?

@agittins
Owner Author

Yeah, I think that's a good idea. I had been wondering how to limit the slope angle (ie, speed) in the filtering part, but the way you put it makes a lot more sense - examine the reading against the last cycles and reject if it implies too high a velocity. Will need an option to select between European and African, though.

Could you give me a screenshot of the same graph above, but just the slice of time from 11pm to 11:05pm? I'd like to see some of the detail in that area just to see how it was tracking through those jumps.

That bit where Suma popped out to pick up dinner should be fixed now, too - I wasn't expiring old readings, so a device that had gone "out of range" would still show as being that distance away. Not always a problem (esp if one has max_radius set rather low) but it can cause a device to get "stuck" in an area due to an old reading being closer than new readings in a different area.

I think we'll always have instances of scratched furniture though - if an area gets a stronger signal it will (correctly, imo) register it as being the closest proxy. It might be resolvable further down the track with proper trilateration, but in this generation it's about all we can do.

@jaymunro
Contributor

Sure. Like this?
Screenshot 2024-03-17 at 10 39 54 PM

@agittins
Owner Author

Yep, perfect thanks. Nothing surprising there, which is good. Just wanted to make sure the moving average wasn't doing anything weird in those blips.

So I think applying a max_velocity limit looks like a good improvement. Here's just my watch sitting at my desk and shielding it to get variations (actually it's quite variable anyway, being strapped to a large bag of mostly water):
image

It seems to be successfully ignoring the bigger leaps but still honouring them if the previous reading is old. I have it scan back through the history of readings to find the max velocity for each comparison, so even repeated long distances get ignored if it still would have required a high velocity to achieve it.

I've just hard-coded the limit to 3 m/s, but I'll add a config setting for it I think, since it might be a useful value for end users to tweak given none of the distances are "real".
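A rough Python sketch of a velocity check along these lines follows. It is an illustration under assumptions rather than the code on main: the class name and history length are invented, rejected readings fall back to the last accepted value, and only movement away from the proxy is limited (approaches are always accepted).

```python
from collections import deque


class VelocityFilter:
    """Ignore readings that imply moving away faster than max_velocity."""

    def __init__(self, max_velocity=3.0, history_len=10):
        self.max_velocity = max_velocity        # metres per second
        self.raw = deque(maxlen=history_len)    # raw (stamp, distance) readings
        self.accepted = None

    def update(self, stamp, distance):
        """Return the new reading if plausible, else the last accepted value."""
        # Highest implied speed away from the proxy against any earlier reading.
        peak = max(
            (
                (distance - old_distance) / (stamp - old_stamp)
                for old_stamp, old_distance in self.raw
                if stamp > old_stamp
            ),
            default=0.0,
        )
        self.raw.append((stamp, distance))
        if self.accepted is None or peak <= self.max_velocity:
            self.accepted = distance
        return self.accepted
```

Because the comparison runs over the whole history rather than just the previous reading, a spike that repeats for a second or two is still rejected, while the same jump is accepted once enough time has passed for it to be physically plausible.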

It's up on main if you wanted to try it out.

@agittins
Owner Author

I'm going to open a fresh issue for chatting about filtering and algo ideas now that v0.5.0 is out, and this issue is well-and-truly sorted with that :-)
