Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sequential PLM Scenes May be Ignored #26

Closed
krkeegan opened this issue Dec 17, 2012 · 4 comments
Closed

Sequential PLM Scenes May be Ignored #26

krkeegan opened this issue Dec 17, 2012 · 4 comments

Comments

@krkeegan
Copy link
Collaborator

This is a weird one, and I am still diagnosing it, but this is what I believe is happening so far.

Issue:

If two PLM scene commands are issued sequentially, the second command may never be sent. In this scenario, each Scene controls only one device. Specifically, they are surrogate scenes for KPL buttons. The basic print log shows something as follows:
PLM_Scene_1->set('on')
Received acknowledgement from PLM_SCENE_1
PLM_Scene_2->set('on')
Received acknowledgement from PLM_SCENE_1 (This is not a typo, the print log shows the first scene as acknowledged twice.)

Current Theory:

Setting the insteon debug level to 3 provides some additional detail. If left at the default settings, it looks like the PLM_Scene_1 command is issued and an acknowledgement is received. MH then tries to send the PLM_Scene_2 command but receives a "PLM is extremely busy" error 15 and claims that the command will be retried in 1 second.

While waiting to retry the command, MH receives a command from the PLM. The PLM reports this as being a second acknowledgement from PLM_SCENE_1. (I suspect at this point that MH erroneously deletes the pending command for PLM_Scene_2 that is waiting, believing that this command has been acknowledged).

If I up the Insteon_xmit setting to .7 seconds. Everything seems to work fine. The Insteon:3 debug log shows the following general sequence:
PLM_Scene_1->set('on')
Received acknowledgement from PLM_SCENE_1
Received All_Link Cleanup success from PLM_SCENE_1

PLM_Scene_2->set('on')
Received acknowledgement from PLM_SCENE_2
Received All_Link Cleanup success from PLM_SCENE_2

So I am not an Insteon guru, but it looks like we should be expecting two acknowledgments to PLM Scenes (at least this specific type of scene). While the first acknowledgment is specific to the one linked device. I believe the second acknowledgement is basically saying "All of the linked devices in this scene have reported in successfully."

I suspect that with the Insteon_xmit setting at the default level, that the All-Link Cleanup success message is why the PLM is busy when we attempt to send the PLM_Scene_2 command.

As noted above, I further suspect that at the default setting, the All-Link Cleanup success message causes MH to delete the pending PLM_SCENE_2 command.

Solutions:

  1. Fix the code so that MH only removes a pending message if it receives a correct acknowledgement. Right now it looks like the comparison is not working perfectly.
  2. If PLM Scenes will always receive an ALL-Link Cleanup Success message, maybe we should wait for it to come in.
  3. Some combination of 1 and 2.

Status:

I am still working on diagnosing where in the code this all happens, if anyone smarter than me knows where to look or has some suggestion, please let me know.

krkeegan referenced this issue in krkeegan/misterhouse Dec 20, 2012
Previously, insteon cleanup messages received by MH, would cause MH to clear the active message
withoug confirming if the cleanup message corresponded to the active message.

If multiple PLM scenes were sent sequentially, a cleanup message from one scene may have caused
MH to clear out the following scenes command.

This patch only clears the active message if the cleanup message matches the active message.
@mstovenour
Copy link
Collaborator

Great catch, I think I've seen this too when sending two commands in a row to the same device. Both the existing code and the proposed fix concern me. I'm concerned that the existing behavior tried to remove a message that hasn't been sent. IMO, this should never happen and should be part of the check. At the same time I agree that the response matching logic needs to be stronger (to cover my case). I think this should be fixed in two ways. We could add (if it doesn't already exist) a "sent" flag in the insteon message object that is false initially and set when the message is sent. Then the insteon message object should have a match_ack() function. This function returns true if the ack matches the message and "always" returns false if the "sent" flag is false (to cover your case). The matching logic in the match_ack() function is otherwise very similar to what you wrote. We could then call the match_ack() function with the ack message while scanning the message stack. Would you be willing to try to fix it this way? I don't mind giving it a go otherwise.

@krkeegan
Copy link
Collaborator Author

I agree that checking to see if a message has even been sent would be a nice addition. However, at the moment, it is not on the top of my list to add to the code. I am not the world's greatest perl programmer and I am struggling as it is to figure out the structure of greg's design.

Even without checking if a message has been sent, I think my proposed code still works. In order for a message to be cleared, it has to be from the same device, for the same group, with the same command. A message which has not been sent which matches these criteria would be a duplicate and unnecessary anyways.

A quick solution I can think of, is to check the send_attempts. However, send_attempts iterates when the PLM reports a busy code "15". So a number greater than 0 is not necessarily a sign that a message was sent. We could alter this behavior, but it could result in an infinite loop on the off chance that the PLM continually fails and responds only with busy codes.

mstovenour, if you have the ability to add your suggested code, please by all means do so. Otherwise, I will mark this as an issue for me to come back to in the future.

@mstovenour
Copy link
Collaborator

I'm still wading through the code to get a better feel for how it works and haven not come to a well formed conclusion for this issue. I will admit that the code you introduced is worse case benign; I don't see any harm in the changes. However I still think maybe something more interesting is going on. The 2412U PLM dev guide from 2007 (i.e. covers i1 and i2 but not i2CS) uses the term "all-link group" for a mh "scene". Apparently there are a number of things that happen when we ask the PLM to send an all-link group command. 1) PLM sends an all-link group broadcast towards the responders. 2) the PLM then sends individual clean up messages to each linked responder. 3a) each linked device that gets the all-link clean up will send an all-link clean up ack (this arrives at MH as an insteon message plm command 0x50). 3b) for each responder that doesn't respond to the PLM's all link clean up, the PLM will send a All-link clean up failure report (0x56) to MH. 4) Finally the PLM will send an all-link cleanup status report (0x58) to MH with either ACK(06) or NACK(15). ACK indicates that the clean up completed. NACK indicates that the clean up was aborted due to other insteon traffic. The interesting part is that the code to process this last PLM command has a semantic error. The author tried to use == to compare a string. I'm not sure if this is part of your issue or not. I'll come back to this once I get scenes working with my I2CS devices. Right now I can't test any of the PLM all-link logic.
lib/Insteon_PLM.pm line 648:
if ($self->active_message && $self->active_message->isa('Insteon::InsteonMessage')
&&($self->active_message->command_type == 'all_link_send'))

@krkeegan
Copy link
Collaborator Author

krkeegan commented Jan 3, 2013

Good catch Michael, I hadn't noticed the attempt to compare a string with ==.

I tested it out and this by itself doesn't solve the above issue. This is because, best I can tell the message that I found to be the cause of the problem is interpreted as a "standard insteon message" and is handled prior to reaching this logic.

Best I can tell the short all_link_clean_status message is simply dumped in the current code. Correcting the logic to use "eq" doesn't seem to have any effect because the current code doesn't really do much with the cleanup messages. I think Greg intended on coming back to this issue and never did.

hollie added a commit that referenced this issue Jan 13, 2013
…df74f1fb

Fix for Issue #26. Mered after krkeegan has been running this fix for 3 weeks without issues.
hplato pushed a commit that referenced this issue Feb 2, 2022
hplato pushed a commit that referenced this issue Feb 2, 2022
This reverts commit ca118e5, reversing
changes made to d096140.
@hplato hplato mentioned this issue Aug 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants