Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESP32/nimble forgets BLE pairing connection to Android client (nasty 'fix' attached) #16

Open
geeksville opened this issue Sep 9, 2020 · 11 comments

Comments

@geeksville
Copy link

Hi,

I recently noticed one of my nodes was discarding formerly working ble bonded connections to my Android phone. This failure occurred after the following error message/clue was emitted in the logs "failed to configure restored IRK". Searching for that string and then adding some debugging led me to the following lines in ble_hs_resolv_list_add() as the culprit.

    if (ble_hs_is_on_resolv_list(ident_addr, addr_type)) {
        return BLE_HS_EINVAL;
    }

Note: for my application I teardown and then restart the BLE stack fairly often, because my device enters light-sleep and I want to turn off bluetooth during that period. This code and the nimble fixes in espressif/esp-idf#5530 had been working quite nicely for over a month with a couple thousand shipped devices on my open-source project (www.meshtastic.org). This failure started to occur for only one of my personal test devices.

Not knowing much at all about the innards of this stack my nasty hack in this case is to replace
"if there is already an entry found for addr X, fail" with "if there is already an entry found for addr X, replace it"

I have no idea if this hack is a bad idea, but I thought I'd include it here for comment.

(cc @bfriedkin because it sounds like you are doing similar things to me, so if you encounter this bug it might be relevant for you also)

My private debug output listed below:
    BLE task running
    registered service 0x1800 with handle=1
    registered service 0x1801 with handle=6
    registered service 6ba1b218-15a8-461f-9fa8-5dcae273eafd with handle=10
    registered service cb0b9a0b-a84c-4c0d-bdbb-442e3144ee30 with handle=18
    failed to configure restored IRK
    BLE advertisting type=0, Private=0, Device Address: 39:24:62:ab:dd:df
@geeksville geeksville changed the title ESP32/nimble forgets BLE pairing connection to Android client ESP32/nimble forgets BLE pairing connection to Android client (nasty 'fix' attached) Sep 9, 2020
@geeksville
Copy link
Author

cc @bfriedkin - just in case you also encounter this.

@bfriedkin
Copy link

@geeksville -

Thank you for keeping me in the loop. I haven't seen what you describe (most of my testing has been on iPhone), but it is good to know about this nonetheless.

Regards,
Brian

@prasad-alatkar
Copy link

@geeksville Can you please share for which Android device did you test ? If possible can you please share any simple steps to follow which can help me reproduce the issue ? From your description, it looks to be kind of very interesting scenario. The reason ble_hs_is_on_resolv_list returns BLE_HS_EINVAL because ble_hs_resolv_list_add is called for peer already existing in Resolving list i.e. IRK already present with our device. However if I am right with my first cut analysis regarding the issue then we may have to look for better fix than commenting out that section of code :) Can you please provide me debug logs for the issue with attached NimBLE submodule patch :
NimBLE_forgets_bonded_peers_check_IRK_changed.txt ? You may need to apply the patch at $IDF_PATH/components/bt/host/nimble/nimble. You may need to enable debug logs from menuconfig i.e. make menuconfig --> Component Config --> Log output --> Default log verbosity to Debug.

@geeksville
Copy link
Author

geeksville commented Sep 22, 2020 via email

@prasad-alatkar
Copy link

Thanks @geeksville , meanwhile I will try to catch the issue myself.

@prasad-alatkar
Copy link

Hi @geeksville I tried to reproduce the issue with some forced setup. Though I was able to get the error prints ( failed to configure restored IRK), I did not see any issue with re-bonding with earlier bonded peers. Just to give a little background, we use Host based privacy and not Controller based privacy. Now for controller based privacy, when host stack syncs with controller (init-deinit of BLE stack results in multiple syncs ), the IRKs are supposed to be restored in Controller. Now as we use Host based privacy, the above debug/error prints just indicate that the IRKs for the peer are already present in Resolving list so ble_hs_resolv_list_add returns failure. IMO, this should not have any material impact on re-bonding with peers. If possible, kindly share debug logs for the issue, this will help me root-cause the issue if any.

@sicanins
Copy link

Hi,
not sure if this helps. I was also trying to stop and restart the stack and came across these error messages. They dont do any harm as expected.
But as I did not know this, i tried to get rid of them, adding a ble_hs_resolv_list_clear_all(); after shutting it down. This cleared the error message.
But I had to find out, that it left the persistent key storage in some invalid state that blocked any further connections.
Didnt dig deep, just deleted the store completely and it worked again.
But I guess that there might be some check needed to detect a corrupted key storage.

Just deleting all peers with ble_store_util_delete_peer() did not resolve the problem!

Hope this helps,
Simon

@prasad-alatkar
Copy link

Hi @sicanins apologies for very delayed response, as I understand from your reply keeping these error messages as it is (i.e. ignoring them) do no harm, that is what I had mentioned in above reply too.

@geeksville , Can you please take a look at my above reply ? If you still face the issue, please try to provide some logs as mentioned earlier. Thanks.

@h2zero
Copy link

h2zero commented May 24, 2021

@prasad-alatkar I have found this issue as well.

Steps to reproduce:

  • Connect & bond with ESP32 using a phone.
  • Disconnect from ESP32
  • ESP deinitialize NimBLE stack and sleep
  • Delete bond info on phone
  • ESP wakeup and advertise
  • Connect again with phone causing a repeat pairing
  • Pairing will fail and be unable to pair again until ESP reboots.

@geeksville's patch does indeed correct this issue, which seems to be caused by a mismatch of addresses after the above procedure.

@ChrisEAlfred
Copy link

ChrisEAlfred commented Sep 22, 2021

It has been several months. Has this been fixed yet?
Can we have an updated bleprph example which handles RPA? RPA seems to be mandatory on iOS (I think).

jspngh added a commit to ac-frg/esp-nimble that referenced this issue Dec 8, 2021
@VNovytskyi
Copy link

Hello!
I also have the same error. I added the following code to the NimBLEServer.cpp:
image
It fixes the problem, but you must enter the PIN code twice. If someone knows how to fix it, please tell me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants