Sonoff S26R2ZB Smart Plug Zigbee Router devices by ITead disconnect from network after a day or two - Firmware bug or? #10282

Closed
daufinsyd opened this issue Dec 20, 2021 · 223 comments
Labels: problem (Something isn't working), stale (Stale issues)

Comments

@daufinsyd

daufinsyd commented Dec 20, 2021

Hello :)

What happened

After a day or two running perfectly fine, routers kind of disconnect from the network; trying to execute any command results in timeout.
It happens for all routers I have (6 sonoff s26r2zb). End devices (sonoff snzb02 / 03) (directly connected to the coordinator) don't appear to be affected.

example of failed command:

Zigbee2MQTT:error 2021-12-18 16:30:30: Publish 'set' 'state' to 'PLUG-SF-SALON' failed: 'Error: Command 0x00124b0024c08124/1 genOnOff.on({}, {"timeout":10000,"disableResponse":false,"disableRecovery":false,"disableDefaultResponse":false,"direction":0,"srcEndpoint":null,"reservedBits":0,"manufacturerCode":null,"transactionSequenceNumber":null,"writeUndiv":false}) failed (Timeout - 22057 - 1 - 184 - 6 - 11 after 10000ms)'

After the router "disconnected", most of the time, simply unplugging it and plugging it back in solves the issue for a day or two.

I tried increasing the TX power but it didn't help; all devices (with either good or bad LQI) disconnect.

What did you expect to happen

Devices are working.

How to reproduce it (minimal and precise)

  • Join devices
  • Wait a day or two

Debug info

Zigbee2MQTT version: 1.22.1
Adapter hardware: CC2652RB (from slae.sh, as recommended on the official website)
Adapter firmware version: transportrev 2, product 1, majorrel 2, minorrel 7, maintrel 1, revision: 20210708

Edit: typo

@daufinsyd daufinsyd added the problem Something isn't working label Dec 20, 2021
@daufinsyd
Author

It might be related to #10276, but since these are made by another manufacturer I don't know ...

@irakhlin

irakhlin commented Dec 21, 2021

I am having the same issue with two different firmwares and multiple different adapters, granted all are CC2652P.

Routers:

  1. Tube's Router/Repeater (CC2652P) (two units)
  2. CircuitSetup's CC2652P2 USB Coordinator (flashed with router firmware)
  3. Sonoff ZigBee 3.0 USB Dongle plus (two units flashed with router firmware)
  4. Zigstar v4 (flashed with router firmware)

The two firmwares I have tried flashing all these devices with are:

  1. https://github.com/Koenkk/Z-Stack-firmware/blob/master/router/Z-Stack_3.x.0/bin/CC1352P2_CC2652P_launchpad_router_20210128.zip
  2. https://ptvo.info/zigbee-configurable-firmware-features/ configured with only the basics in router mode and a ping rate of around 5 minutes.

I will see these devices behave normally for anywhere from 1-3 days, consistently pinging in; eventually, however, they stop pinging in. I usually only notice this when the last seen for all of them gets toward the 12-hour-plus range. It does seem that they all stop reporting in around the same time, but I cannot verify that for sure. Running a network map request reports back that the devices are unreachable.
Only error I see for each device:
error 2021-12-20 16:37:27: Failed to execute LQI for 'Basement Router'

I have tried doing a packet capture but in all honesty I am not exactly sure what I should be looking for in this case. Simply power cycling the devices seems to return them to a working state in all cases. I just did this to all of them around 2-3 hours ago, so please let me know if there is anything else I can capture that would help troubleshoot this issue.
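
One way to notice unresponsive routers sooner than a 12-hour-old "last seen" value is Zigbee2MQTT's availability check. The fragment below is a minimal sketch for Zigbee2MQTT's configuration.yaml, assuming a release that supports the `availability` block (the option name and defaults have changed between releases, so check the documentation for your version). The 10-minute timeout is only an illustrative value, not something reported in this thread.

```yaml
# Zigbee2MQTT configuration.yaml (fragment; values are illustrative assumptions)
availability:
  active:
    # Minutes without contact before a mains-powered device is pinged
    # and, if it does not answer, marked offline.
    timeout: 10
advanced:
  # Include a last_seen timestamp in each published device state.
  last_seen: ISO_8601
```

With this enabled, a router that has stopped responding should show up as offline within minutes rather than being spotted by hand.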

It might be related to #10276, but since these are made by another manufacturer I don't know ...

One thing I want to point out: none of my other devices are exhibiting this behavior. These include, but are not limited to:
KMPCIL_RES005 (KMPCIL)
Xiaomi (MCCGQ11LM, WXKG11LM, WSDCGQ11LM, VOCKQJK11LM)
Philips Hue (9290012607, 9290022166, 9290012573A)
Innr (sp224, sp234)
Smartthings (STS-IRM-250)

The issue only appears with the custom router devices and seems to have been happening fairly consistently across multiple versions of zigbee2mqtt, currently on 1.22.1.

My coordinator is a https://www.tubeszb.com/product/cc2652_coordinator/1?cp=true&sa=false&sbp=false&q=false&category_id=2 with firmware version 20210708

@daufinsyd
Author

Great to see that we're not the only ones! :)

I usually only notice this when the last seen for all gets toward the 12hour + range. It does seem that they all stop reporting in around the same time but I cannot verify for sure.

I had the issue yesterday and noticed that 2 routers were still connected. So it definitely doesn't happen for all at the same time (but within 12 hours, as you said?).

I also get the same LQI error message as you.

I bought a Sonoff ZigBee 3.0 USB Dongle plus and I'll do some tests once I receive it. I'd like to provide more debug info but don't know where to start.

@daufinsyd
Author

According to HA, the routers disconnect every two days and 5 hours.
[Screenshot: Home Assistant history, 2021-12-26 17:07:42]

I got a new Sonoff dongle plus; I'll try with it.

@daufinsyd
Author

I flashed the latest firmware (20211217) on the new Sonoff dongle 3.0 plus 2 days ago, and every router disconnected today. :(
I didn't set the experimental TX power, so that shouldn't be the cause. Strangely enough, increasing the TX power seems to lower the reliability (I ran a few tests and, although the LQI was a little better, I had way more 'lqi failed for...' errors).
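
For readers unsure which setting is meant here: in Zigbee2MQTT releases of this era, the adapter transmit power for Z-Stack adapters was set in configuration.yaml. The fragment below is a sketch under that assumption. Later releases may place the option elsewhere, and the value shown is purely illustrative, not a recommendation (the thread reports that raising TX power did not help).

```yaml
# Zigbee2MQTT configuration.yaml (fragment; assumes a 1.22-era layout)
experimental:
  # Adapter transmit power in dBm (Z-Stack adapters only).
  # 9 is an illustrative value, not one taken from this thread.
  transmit_power: 9
```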

@daufinsyd
Author

So I ordered new plugs and Sonoff dongles, and created a new network from scratch with:

  • 1 Sonoff dongle 3.0 plus (factory coordinator firmware)
  • 1 Sonoff dongle 3.0 plus with the latest router firmware flashed
  • 3 Sonoff plugs (1 near the coordinator, 2 near the router)

After two days, all 3 plugs were unresponsive.

@irakhlin

irakhlin commented Jan 13, 2022

@daufinsyd
I can report back that I am still having this issue, and I believe it further points to one of a few things:

  1. These devices are somehow being crashed by some other rogue device, if this is possible? However, only custom devices are affected.
  2. I have tried this with both the router firmwares provided by @Koenkk and by @ptvoinfo and many different devices.
  3. The behavior across the devices is slightly different but the end result is always the same: devices with any of the custom "router" firmwares available will work without issue for anywhere from 24 hours to 4-5 days; eventually they stop pinging in, ALL become unresponsive, and they will not ping or respond to commands. When I say all, I mean 5 "standard router devices" and 4-5 devices I categorize as "custom" below.
  4. It does seem that my coordinator and almost all the devices in question are using the cc2652p chip( easier to come by?). Outside of the sonoff zigbee 3.0 dongle plus they are using the RFstar cc2652p2.
  5. Unplugging the device from power and plugging it right back in seems to work in resolving the issue until the next crash.

Standard devices tested (using both zstack router firmware and ptvo basic configured router firmware)

  • Tube's Router/Repeater (CC2652P) (two units)
  • CircuitSetup's CC2652P2 USB Coordinator (flashed with router firmware)
  • Sonoff ZigBee 3.0 USB Dongle plus (two units flashed with router firmware)
  • Zigstar v4 (flashed with router firmware)

Custom devices (using ptvo firmware, tested configured with a 60 second ping; "enable watchdog timer" was enabled in some and not in others, to check for any difference).

I created some custom PCB boards based on the designs of the popular coordinator sticks available (tubes router/zigstar/zzh), with the sole intent of using them as router devices loaded with various sensors (temperature/humidity/VOC/etc.), currently only using temperature/humidity. These were configured with ptvo firmware. Additionally, as I had added an extra LED to the board, I was able to create a virtual switch in ptvo's firmware that would allow me to control the LED on the board. The intent here was to see if the device would respond to an on/off request when it went into a state of no longer pinging in.

[Image: sensor]

result: the device works perfectly, pinging in every 60 seconds for days, seemingly 4-5 days. At a certain point I will notice that temperature data is not updating or changing in Home Assistant; I check back in zigbee2mqtt and the devices show that they haven't pinged in for X hours. Attempting to flip the virtual switch to trigger the LED on the board produces a timeout in zigbee2mqtt. Unplugging the device from power and instantly plugging it back in seems to fix everything with that device until the next "crash".
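
Rather than spotting stale temperature data by eye, a Home Assistant automation can flag the condition. The following is a minimal sketch, not something taken from this thread: the entity ID `sensor.custom_router_temperature` is hypothetical, the 10-minute window is an arbitrary choice, and the entity will only turn `unavailable` if Zigbee2MQTT's availability feature is enabled.

```yaml
# Home Assistant configuration.yaml (fragment; entity ID is hypothetical)
automation:
  - alias: "Alert when a Zigbee router stops reporting"
    trigger:
      - platform: state
        entity_id: sensor.custom_router_temperature
        to: "unavailable"
        # Only fire if the state persists, to ignore brief dropouts.
        for: "00:10:00"
    action:
      - service: persistent_notification.create
        data:
          title: "Zigbee router offline"
          message: "sensor.custom_router_temperature has been unavailable for 10 minutes."
```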

I have setup a second testing instance of home assistant with ZHA, will probably also install zigbee2mqtt. I have plenty of zigbee devices I can throw at the test network but I would really need a better understanding from @Koenkk or someone with more knowledge on this to help collect useful debugging data.

@tuffelh

tuffelh commented Jan 13, 2022

I can also confirm this behavior with Sonoff Stick flashed with latest router firmware.

@beezly

beezly commented Jan 14, 2022

I have a Sonoff usb stick (using the 20220103 firmware) configured as a coordinator, and a bunch of zzh's as routers along with some Sonoff s26r2zb's. Interestingly, I see the same behaviour, but only in the s26r2zb devices. All of the zzhs remain connected to the mesh.

@beezly

beezly commented Jan 14, 2022

Some additional info about the S26R2ZBs. I've taken one apart and photographed the insides (mine is the UK version, but I doubt there's much variation on these things).

Looks like it is powered by a CC2652 and has some handy solder pads for RX, TX, etc.

[Five photos of the disassembled S26R2ZB: PXL_20220114_144408328, PXL_20220114_144257428, PXL_20220114_144345794, PXL_20220114_144221877, PXL_20220114_144157869]

@daufinsyd
Author

I contacted the Sonoff Support and linked this case, I got a reply :

Dear buyer, the technical department has tested it, this is the problem of mqtt, they have contacted the mqtt staff to deal with it

(I guess they meant zigbee2mqtt and @Koenkk).

@Koenkk
Owner

Koenkk commented Jan 18, 2022

@daufinsyd they haven't contacted me about this. Could you make a sniff when command execution fails?

https://www.zigbee2mqtt.io/advanced/zigbee/04_sniff_zigbee_traffic.html#with-cc2531

@daufinsyd
Author

@daufinsyd they haven't contacted me about this. Could you make a sniff when command execution fails?

https://www.zigbee2mqtt.io/advanced/zigbee/04_sniff_zigbee_traffic.html#with-cc2531

I'll buy a CC2531 stick and post the output.
Maybe @irakhlin can try to sniff when command execution fails in the meantime?

@michel-zedler

I can also confirm this behaviour.

ZigBee Smart Plug DE Type S26R2ZBTPF
Zigbee2MQTT version: 1.22.1
Adapter hardware: CC2652RB from https://slae.sh/projects/cc2652/

@pvprodk

pvprodk commented Jan 20, 2022

I have also experienced the S26R2ZB going offline.
On top of that, they don't seem to allow end devices to join (even when not offline): even if I hold the end device right next to the S26R2ZB, the end device still joins directly to the coordinator. If I try to add end devices which are out of range of the coordinator but in range of the S26R2ZB, they fail to join. On top of this, the range of the S26R2ZB is very bad, and very low LQI numbers are reported.

ZigBee Smart Plug DE Type S26R2ZBTPF
Zigbee2MQTT version: 1.22.1
Adapter hardware: ZBBridge with Tasmota, EZSP (which is running rock solid in my setup btw)

@vmonkey

vmonkey commented Jan 20, 2022

I experience the same issue, FR version of the plugs. I can also confirm pretty low LQI values. They do seem to work as routers though.

@irakhlin

@daufinsyd @Koenkk I would be more than happy to do a sniff when the issue happens again. Interestingly enough, however, since my last update (8 days ago) all of my routers are online and showing strong signal without any restarts. I did, however, rearrange my mesh slightly on that day and removed 2 router/repeaters that had terrible signal strength. Could this issue be related to this? Both devices I removed were cc2652p based routers; could this lead to them attempting to route messages, failing due to signal strength, and causing other devices to overflow? I will continue monitoring closely and let you know if there is any change.

@irakhlin

@Koenkk It looks like my network is partially back to that same state. It seems half of my routers have not responded in 23 hours (running the standard zstack router firmware). The other half of my routing devices are using ptvo firmware; they are all pinging, but they do have this feature enabled:
Enable watchdog timer – this option enables a built-in watchdog time in the chip. The timer would reset the device if it froze for more than 1 second. I do not recommend using it in complex and untested configurations with many sensors, and the device may unexpectedly reset.

I am going to get a sniffer out, is there anything specific you need that would help troubleshoot this?

@irakhlin

Hey @Koenkk and @daufinsyd

I was able to get a capture of my network: one without attempting to trigger any devices manually, and one while flipping a couple of devices that are alive and a few of the routers mentioned above. I see the message to the router but no response back. I am also not all that familiar with what I should be seeing, but it does seem that maybe my network is being bombarded with too many packets? Regardless, I put the two captures together with a list of the devices on my network with their status and my key. What is the best way of providing this to you?

@Koenkk
Owner

Koenkk commented Jan 23, 2022

@irakhlin could you upload the pcapng file + network key here? Also, what is the network address of the s26r2zb? (You can find this in the z2m frontend.)

@irakhlin

@Koenkk Here is the capture, in my case the router nodes going out are not s26r2zb but cc2652p boards. The key, a few different captures and a spreadsheet of my network addresses with the affected DOWN nodes is included.
capture.zip

@Koenkk
Owner

Koenkk commented Jan 23, 2022

@irakhlin thanks for the logs. I checked some of the sniffs and it seems the devices don't send "Link status" messages anymore (which routers should do). I don't expect this to be a z2m bug, rather a bug in the TI SDK. My router fw is using quite an old SDK; I will try to provide an update in the coming days.

@irakhlin

@Koenkk I would tend to agree that this is probably an issue in the firmware. But I just wanted to make sure you saw my earlier comments: half of the affected devices are running your router firmware but the other half are running @ptvoinfo (https://ptvo.info) firmware. Additionally, none of my "non-DIY" devices are affected, other than lower LQI when the router nodes crash. I did spend some time trying to test using the new SDK with some changes to the patch file, but in that case the devices would join and almost instantly leave the network. This is probably because of something I did, however. Let me know if you need me to test the firmware when you get around to it.

@Koenkk
Owner

Koenkk commented Jan 24, 2022

@ptvoinfo also uses the TI stack, however I don't know which version.

@vmonkey

vmonkey commented Jan 24, 2022

@Koenkk If I understand this correctly, the proper solution would then be to flash an updated firmware onto the S26 plugs? If yes, is there a possibility that this can be done OTA through zigbee2mqtt, or would I need to disassemble the units and flash them? (sorry if these questions are stupid - I am a newbie in all these home automation/hobby small electronic devices).

@ptvoinfo
Contributor

ptvoinfo commented Jan 24, 2022

@Koenkk I use "simplelink_cc13x2_26x2_sdk_5_10_00_48".

This problem looks like a memory leak somewhere in the SDK. After some time the firmware cannot allocate memory for current tasks and stops responding.

@jollytoad

I'm trying these out atm... https://www.amazon.co.uk/dp/B0B9NF7D22
Although they show as SA-029 in Z2M, they are very basic (no power on behaviour) ... not sure how well they work as routers yet. The packaging has no manufacturer name, no 'Woolley' on there. Zigbee manuf and model is: SONOFF/Z111PL0H-1JX according to Z2M. Appears to have no OTA update ability. You get what you pay for I guess!
I still need a couple more plugs, so I'm still on the lookout for something potentially more reliable/future proof.
I was considering Innr plugs.

@tungmeister

I opened a ticket about 2 1/2 weeks ago and I'm yet to hear anything

@eokgnah

eokgnah commented Mar 11, 2023

[Image: sonoff]
Got replacements today... Only had to pay for shipping.
They are:
Firmware build date: 20220420
Firmware version: 2.1.1
Manufacturer: SONOFF
Model: S26R2ZB

@tungmeister

I never got a response to my ticket. Rather disappointing!

@zomar76

zomar76 commented Mar 13, 2023

Hello,
I have 3 S26R2ZB with firmware version 2.0.1; all disconnect from the network as described in this thread.

I'm testing this workaround:

  1. from OpenHab, at midnight, I send
    mqttActions.publishMQTT('zigbee2mqtt/bridge/request/device/remove', '{"id": "0x00124b0024caa5d3"}')
  2. the device reconnects instantly
  3. all 3 devices have been working for more than 72h

@tungmeister

tungmeister commented Mar 13, 2023

@zomar76 I've just tried this; however, the test device didn't automatically rejoin unless I enabled joining and, as I expected, it forgot its name. If this can all be done with MQTT, though, it could be set up as part of a script. I'll have a look.

EDIT: I've just set up this script, which I'll run as an automation at 4am going forward, and see what happens. Hopefully it'll do the trick; as they're not critical devices, it doesn't matter at all if they're unavailable for a few seconds at that time. I went for the 35-second delay because, after a bit of testing, the rejoining/interviewing was consistently completed before that had elapsed.

  script:
    sonoff_zigbee_reset:
      alias: "Sonoff Zigbee Reset"
      sequence:
        # Open the network so the removed plugs are allowed to rejoin.
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/permit_join
            payload: >
              {"value": true}
        - delay: "00:00:01"
        # Remove each plug; it then rejoins while joining is permitted.
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/device/remove
            payload: >
              { "id": "0x00124b00258a4187" }
        - delay: "00:00:01"
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/device/remove
            payload: >
              { "id": "0x00124b0024c24c48" }
        # Wait for rejoin and interview to finish before renaming.
        - delay: "00:00:35"
        # Restore the friendly names that were lost on removal.
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/device/rename
            payload: >
              {"from": "0x00124b00258a4187", "to": "retropie","homeassistant_rename":true}
        - delay: "00:00:01"
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/device/rename
            payload: >
              {"from": "0x00124b0024c24c48", "to": "Garden lights","homeassistant_rename":true}
        # Close the network again.
        - service: mqtt.publish
          data:
            topic: zigbee2mqtt/bridge/request/permit_join
            payload: >
              {"value": false}

@jaanek

jaanek commented Mar 15, 2023

@tungmeister How is that script working for you? I have the same plugs, just wondering if there is some workaround to fix the disconnect.

@tungmeister

@jaanek They've remained online since implementing it so although it's not a proper fix it's a good enough solution for my use case. I've not had anything in my network go offline at all since implementing it.

@zomar76

zomar76 commented Mar 20, 2023

Hello, I have 3 S26R2ZB with firmware version 2.0.1; all disconnect from the network as described in this thread.

I'm testing this workaround:

  1. from OpenHab, at midnight, I send
    mqttActions.publishMQTT('zigbee2mqtt/bridge/request/device/remove', '{"id": "0x00124b0024caa5d3"}')
  2. the device reconnects instantly
  3. all 3 devices have been working for more than 72h

Still working after 10 days.

@SamBouwer

@eokgnah did you happen to have time to install and test these plugs with the 2.1.1 firmware? Mine are on 2.0.1 and are quite buggy.

@Piipz

Piipz commented Apr 10, 2023

I have a few plugs on 2.1.1 firmware and a few on 2.0.1, and I can confirm that the 2.0.1 ones are not working well.
Hence a question: how do I order 2.1.1? Shops never mention the firmware version.
Should I try the Itead support first?

@losip

losip commented Apr 16, 2023

I can confirm that @tungmeister 's script is working for me as a workaround on three S26R2ZB. (I had bought a pack of four but threw one away convinced it must be faulty. Doh!)

@gmangoesgit

Same problem here with firmware 2.0.1
Would love to use the workaround of @tungmeister, but how would you implement such a script? Can that be done in zigbee2mqtt (an extension looks very different) or would this be implemented on homeassistant / openhab or similar?

@tungmeister

@gmangoesgit that's a Home Assistant script that's triggered by an automation.
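
For anyone wondering what such an automation might look like, here is a minimal sketch. It assumes the script posted earlier in this thread is defined in Home Assistant as `sonoff_zigbee_reset` and uses the 4am time mentioned there. The alias is a made-up name, and this is not necessarily the exact automation in use.

```yaml
# Home Assistant configuration.yaml (fragment)
automation:
  - alias: "Nightly Sonoff Zigbee reset"  # hypothetical name
    trigger:
      - platform: time
        at: "04:00:00"
    action:
      # Runs the script defined under `script:` with this key.
      - service: script.sonoff_zigbee_reset
```

The automation can also be created from the Home Assistant UI with a time trigger and a "call service" action pointing at the script.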

@phillystafford

phillystafford commented May 14, 2023

I've a question regarding that workaround script. The plugs stay online with the script but how do the plugs behave as routers? Do all the connected devices which rely on it reconnect?

I've a load of sensors, rad valves that keep dropping off my network because the plugs go offline.

Question 2: if I buy replacement plugs directly from Itead, are they guaranteed to be 2.1.1?

@tppq

tppq commented May 17, 2023

I have not observed similar problems with FW 2.1.1 (20220420), device model S26R2ZB.
Would be curious to know if I am just being lucky or if this firmware has solved the problems for those who previously experienced them. My traffic volume is pretty low though.

@skorokithakis

Is it possible to upgrade the firmware if we open up the plug? Is the new firmware available somewhere?

@Hedda
Contributor

Hedda commented Jun 27, 2023

Is it possible to upgrade the firmware if we open up the plug? Is the new firmware available somewhere?

Yes, it is possible to flash the CC2652/CC1352 manually via a compatible cJTAG debug probe adapter (cJTAG is not the same as standard JTAG) using J-Link Commander or similar.

https://electrolama.com/radio-docs/advanced/flash-jtag/

It is based on Texas Instruments CC2652P (CC2652P1) so you can build your own unofficial custom image using PTVO Zigbee Configurable Firmware tool

https://ptvo.info/

You first need to figure out inputs and outputs for IO.

@skorokithakis

@Hedda that's very useful, thank you!

@ptvoinfo
Contributor

@skorokithakis It has RX and TX pads on the small board. Maybe, you can flash the firmware using UART and the serial bootloader feature (I didn't try it).

@phillystafford

Has anyone tried flashing Tasmota on one of these to see if that'd be a stable fix? 🤔 I've never flashed Tasmota on a device

https://tasmota.github.io/docs/devices/Sonoff-S26-Smart-Socket/

@skorokithakis

@phillystafford I have, but that's for the wifi version of the plug, this issue is about the Zigbee one.

@Hedda
Contributor

Hedda commented Jul 10, 2023

Has anyone tried flashing Tasmota on one of these to see if that'd be a stable fix? 🤔 I've never flashed Tasmota on a device

https://tasmota.github.io/docs/devices/Sonoff-S26-Smart-Socket/

Not possible on the Sonoff S26R2ZB model, and thus off-topic: that is the Zigbee model, and it is not based on an Espressif ESP32 or ESP8266 WiFi chip, so it cannot be flashed with Tasmota firmware (which only works on those Espressif WiFi radio chips and not on Zigbee radio chips such as those from Texas Instruments or Silicon Labs).

@hbkhalil

Hello everyone, did anyone manage to update the firmware?

@SzymonZy

SzymonZy commented Nov 15, 2023 via email

@Tigranvardan

Same problem here with firmware 2.0.1 Would love to use the workaround of @tungmeister, but how would you implement such a script? Can that be done in zigbee2mqtt (an extension looks very different) or would this be implemented on homeassistant / openhab or similar?

Should I create a new file with the File editor, or should I put this example in scripts.yaml?

Sorry, I'm a newbie :)

@Tigranvardan

@tungmeister, can you please help me? Where should I put this script?
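
As a hedged pointer on placement: the script posted earlier in this thread starts with a top-level `script:` key, which is the form used directly in configuration.yaml. If your configuration.yaml instead contains `script: !include scripts.yaml` (assumed here; check your own file), the entries go into scripts.yaml without that top-level key and with one level less indentation. A shortened sketch showing only the first step:

```yaml
# scripts.yaml (loaded via `script: !include scripts.yaml`)
sonoff_zigbee_reset:
  alias: "Sonoff Zigbee Reset"
  sequence:
    - service: mqtt.publish
      data:
        topic: zigbee2mqtt/bridge/request/permit_join
        payload: '{"value": true}'
    # The remaining remove / delay / rename / permit_join steps from the
    # script posted earlier in this thread follow here, unchanged.
```

After saving, reload scripts (or restart Home Assistant) so that `script.sonoff_zigbee_reset` becomes available.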

@TheHuskyCrobats

As I have same issue with sonoff plug (router disconnecting), can anybody let me know the name of a better zigbee plug with router capabilities? Thank you.
Time to let sonoff go ...

@CHARL13is

I'm having the same issue:

  • Sonoff Zigbee USB 3.0 Dongle Plus running 20230507 firmware
  • 4 x Sonoff S26R2ZB devices (mixture of 2.0.0 and 2.0.1 firmware)

Devices are listed as offline after several days; power cycling the devices brings them back online. It's sounding like a firmware issue on the plugs themselves, and unless I'm mistaken there is no other way to flash either the official Sonoff firmware or a 3rd party alternative?

@Hedda
Contributor

Hedda commented Jun 4, 2024

It's sounding like a firmware issue on the plugs themselves, and unless I'm mistaken there is no other way to flash either the official Sonoff firmware or a 3rd party alternative?

Read the whole thread above, as both of those have been asked and answered. Yes, it is a firmware bug, and Sonoff screwed up so OTA flashing is not possible; thus the only way to flash those is manually via a cJTAG programmer / debug probe, see #10282 (comment)

PS: This issue is closed, so unless you have found a new alternative solution, please start a new discussion thread if you have other questions that are not directly related, like asking what other Zigbee Router devices can be used.
