Endstops on a different MCU to stepper motors #3832
…CU to the steppers
Interesting. Thanks. FYI, I recently purchased some Huvud boards and have also been working on this. (I've been researching the work, but have not yet made code changes.) FWIW, my high-level plan was the following:
One of my high-level goals would be to try and put most of this logic into klippy/extras/homing.py - ideally limiting the impact to the toolhead, stepper, probe, etc. code. The recent move of the homing.py code from klippy/homing.py to klippy/extras/homing.py was in preparation for work on this feature. Of course, it's possible this may not be practical. Your thoughts?
It's good to hear it's something you are considering supporting at least :)
Yes, probably, I think I was still struggling to wrap my head around the homing code when I implemented that part.
A more general solution would probably be good, yes. The main reason I added the "stepper group" thing is that I am running a quad motor gantry and it was easier/better to ensure all 4 Z stepper motors stop simultaneously rather than try to correct afterwards (originally I just added a command to call stepper_stop in stepper.c and then discovered each motor was overshooting by a slightly different number of steps when z-probing).
Right, I was a bit worried about how safe that would be considered to be, e.g. if the Raspberry Pi crashes or stalls while the MCU still has a lot of steps queued. So I went with the option of making sure there are never more steps queued than are considered to be safe. In fact, in my tests I have my Y endstop set up only 1.4mm away from the true limit and am able to home at 50mm/s with this code. That gives a window of about 28ms, which is almost the 30ms you quote :) It seems this is quite close to the edge of what my Raspberry Pi 3B+ can handle, though. I would be concerned about what happens if you are interacting with OctoPrint or OctoDash at the same time.
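For reference, the 28ms figure follows directly from the overtravel distance and the homing speed. A minimal sketch of that arithmetic (the function names are illustrative and not part of Klipper):

```python
def reaction_window_ms(overtravel_mm, speed_mm_s):
    """Time (ms) between the endstop triggering and the axis hitting its true limit."""
    if speed_mm_s <= 0:
        raise ValueError("speed must be positive")
    return overtravel_mm / speed_mm_s * 1000.0


def max_homing_speed(overtravel_mm, latency_ms):
    """Fastest homing speed (mm/s) that still stops within the overtravel distance."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return overtravel_mm / (latency_ms / 1000.0)


# Values quoted in the comment above: 1.4mm of overtravel, homing at 50mm/s.
window = reaction_window_ms(1.4, 50.0)
# Inverse question: how fast could we home if the worst-case latency were 30ms?
speed_limit = max_homing_speed(1.4, 30.0)
print(round(window, 1), round(speed_limit, 1))
```

With a 30ms worst-case latency the same 1.4mm margin would only allow roughly 46.7mm/s, which is why 50mm/s sits close to the edge.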
In fact, no. This is just a small refactoring to note_homing_end: instead of directly adjusting _mcu_position_offset in this function I moved it into set_mcu_position instead. Then instead of changing _mcu_position_offset I made it call set_commanded_pos which has the same effect: get_mcu_position will now return the value that was reported by the MCU but in addition it means that get_commanded_pos will now also return a valid position instead of it being invalidated by the homing operation. This slightly simplifies the code in homing.py meaning it no longer has to track MCU steps and can query get_commanded_position to get the position where the steppers have stopped. I have not made any changes to the way steps are generated or how moves are performed, I've simply made it record where the steppers finished including any overshoot so that get_mcu_position and get_commanded_position will both return that position. When the homing code performs its retract operation for the second homing attempt this will undo that overshoot. A further move could be scheduled if you want to undo overshoot from the second homing.
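The bookkeeping described above can be illustrated with a toy model. This is only a sketch of the idea, not the code in this branch: `ToyStepper`, `step_dist`, and the attribute names are assumptions made for illustration, and the real stepper class has considerably more state.

```python
class ToyStepper:
    """Toy model of the host's position bookkeeping for one stepper (hypothetical)."""

    def __init__(self, step_dist):
        self.step_dist = step_dist       # mm per step (assumed name)
        self.commanded_pos = 0.0         # host's idea of the position, in mm
        self.mcu_position_offset = 0.0   # mm offset between host and MCU counters

    def get_commanded_position(self):
        return self.commanded_pos

    def get_mcu_position(self):
        # MCU position in steps, derived from the commanded position plus offset.
        mcu_pos_dist = self.commanded_pos + self.mcu_position_offset
        return int(round(mcu_pos_dist / self.step_dist))

    def set_mcu_position(self, mcu_pos):
        # Instead of adjusting the offset, move the commanded position so that
        # both getters agree with what the MCU reported (overshoot included).
        self.commanded_pos = mcu_pos * self.step_dist - self.mcu_position_offset


stepper = ToyStepper(step_dist=0.0125)
stepper.commanded_pos = 10.0      # where the host expected the homing move to end
stepper.set_mcu_position(812)     # MCU reports it actually stopped at step 812
print(stepper.get_mcu_position(), stepper.get_commanded_position())
```

In this model the commanded position ends up at about 10.15mm, so a subsequent retract computed from it would also undo the 0.15mm of overshoot.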
My thoughts here were that if something has gone wrong/timed out then the machine may now be in an unsafe state with motors that have driven past their intended limits and it should require user confirmation before anything further occurs.
I tried to bear that in mind while writing this code. The only place I think needs special handling (which I commented on) is when to know a motor has definitely stopped and you no longer need to consider it when limiting the drip feed rate.
The changes ended up being quite far reaching, so if more of the homing logic could be brought together into that module it would help clean it up and make it easier to follow. I did find it quite hard going figuring out how it all fit together, like why MCU positions were being used pretty much only during homing and what the 'tag'/'calc' position stuff was all about. Maybe some clarification in the developer documentation could help here. A lot of that is just down to me coming at it from never having seen the code before and not knowing how all the move buffering works, what this 'flush delay' is for, etc.

It also seemed slightly strange to me that there is a separate drip feeding system only used during homing, but it was quite fortuitous as this was ideal for my needs (any reason this is not always in effect, but maybe with larger chunks when not homing?).

I have an idea in the back of my head about being able to delegate the drip feeding to an MCU if the Raspberry Pi is not able to manage the realtime requirements of it in this situation. This might not be too bad with MCUs connected on a CAN bus, such as the Huvud and the Duet 3 tool board, if the main board is also on the CAN bus. (In my current setup I have the option of connecting my 2nd MCU to my 1st if I implement some code to forward comms from klippy. I actually have the 1st MCU connected to the Pi by both UART and SPI so that I could just tell klippy to open the /dev/spidevX.X to talk to the 2nd MCU.)

Final point: the regression tests are failing due to the addition of 'report_ticks' and 'triggered_time' in the endstop commands. Do you think it would be possible/better to add separate commands for the blind homing endstops so the old commands and tests still work? I haven't looked at how the tests work at all or what the process is for regenerating the data if something like this changes.
Yes - the homing code is quite complex in Klipper. I struggle with it also. In general, homing is difficult because there are so many different printers and different ways to "home". The code itself could likely also be improved.
The Klipper host python code can have some high latency. However, the host serialqueue.c C code is very responsive. I'd guess less than 1ms of added latency in the normal case. There's still the worst case scenario, but in that case my plan was to have the micro-controllers detect the loss of "continue to home" messages and stop the steppers on their own.
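The fallback described above (the micro-controller stops on its own if the "continue to home" messages dry up) amounts to a watchdog timer. A minimal simulation of that idea, written in Python for readability; the class, its method names, and the 30ms timeout are all assumptions, and a real implementation would live in the micro-controller firmware:

```python
class HomingWatchdog:
    """Toy model of a micro-controller side keep-alive check (hypothetical)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_keepalive = None
        self.stopped = False

    def start(self, now):
        # Homing begins; the start itself counts as the first keep-alive.
        self.last_keepalive = now
        self.stopped = False

    def note_keepalive(self, now):
        # Host says "continue to home". Ignored once the watchdog has fired.
        if not self.stopped:
            self.last_keepalive = now

    def poll(self, now):
        """Called periodically; returns True while the steppers may keep moving."""
        if self.stopped or self.last_keepalive is None:
            return False
        if now - self.last_keepalive > self.timeout_s:
            self.stopped = True   # host went quiet: halt the steppers
            return False
        return True


wd = HomingWatchdog(timeout_s=0.030)
wd.start(now=0.000)
ok_early = wd.poll(now=0.010)        # 10ms since start: keep moving
wd.note_keepalive(now=0.020)
ok_refreshed = wd.poll(now=0.045)    # 25ms since last keep-alive: keep moving
ok_late = wd.poll(now=0.060)         # 40ms of silence: stop
wd.note_keepalive(now=0.061)         # a late message does not restart motion
ok_after_stop = wd.poll(now=0.062)
print(ok_early, ok_refreshed, ok_late, ok_after_stop)
```

Once the watchdog has fired it stays stopped until `start` is called again, so a delayed keep-alive cannot silently resume a homing move.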
That is what I was referring to earlier. The above is not robust - it can lead to stepcompress errors in the general case. For example, if one Z motor needs to move 5.005mm on the retract, and another Z stepper needs to move 5.020mm, then the iterative solver code will start the moves with an instantaneous movement followed by regular movement - it does not average out the overshoot over the entire move.
There are two goals with that - one is to keep units in millimeters and seconds where possible (it's easier to reason about the code without many unit conversions) and the second goal is to handle the many queues that are in place. There's some info on the queues at https://www.klipper3d.org/Code_Overview.html#coordinate-systems . Basically, different parts of the code have a different understanding of "current position" depending on which part of the queue they work on. The "mcu position" isn't really needed, except when we need to determine exactly how many steps have been taken.
Under normal circumstances, Klipper calculates in advance the exact time each move will take. It can therefore generate the steps for a move, queue them up, and go on to other tasks (like processing and scheduling more movement commands). Due to the nature of the G-Code protocol, we have to queue movement commands and perform lookahead on them in batches - we can't read a command, generate its steps, read the next command, etc. In contrast, when homing, the code does not know how long the move will take (the endstop could be triggered at any point) and we must not process new commands until the home is complete. So, the "drip move" system was added to handle this special case. If you look at the normal
FYI, if the synchronization is done in the host C code, I don't think it will be necessary to implement synchronization in the micro-controllers. In my experience, the host C code is very responsive.
If you look at the python code, you'll see various checks for -Kevin
That's good to know.
Right, that's why I had to add the stepper group stuff. I think it's probably ok to enforce that steppers that have to stay in sync need to be on the same MCU, and maybe add special handling for that case later if required. Sounds like implementing some kind of keep-alive system on the serial thread is the way to go, and maybe that will simplify the logic a bit too.
So, as a high-level question, how would you like to proceed on this? I'm planning to try out some of the ideas I mentioned above, but I'm probably a few weeks out from having feedback with the results of those tests. -Kevin
If the config file had a parameter for "number of mm past the endstop switch where damage will be done", then all guesses about latency could be removed from the code... You can simply make sure that contact is made with both MCUs at least once within that distance. For most regular microswitches that number is 5mm or more, so homing should be fast even with slow python in the loop. Even for more precise switches it still tends to be at least 1mm.
I've got a few other things to sort out on my printer first but I'll probably take another look at this once I'm back up and running. What's here already does work for me (most of the time) but it would be nice to clean it up and move the time critical stuff into the C code. I'll let you know when I've taken another pass at it.

@Hello1024 - that is exactly what this change does. There is no guessing: it uses the position_endstop and position_max/min values from the config to determine the safe amount to move.
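As a rough illustration of deriving the safe distance from the config (a sketch only, not the code in this branch): `position_endstop`, `position_min`, `position_max` and `homing_positive_dir` are existing stepper config options, while the function itself and the example values are assumptions.

```python
def safe_overtravel(position_endstop, position_min, position_max,
                    homing_positive_dir):
    """Distance (mm) the axis may travel past the endstop before its configured limit."""
    if homing_positive_dir:
        margin = position_max - position_endstop
    else:
        margin = position_endstop - position_min
    if margin < 0:
        raise ValueError("position_endstop lies outside position_min/position_max")
    return margin


# Example values (assumed): endstop at 0mm, homing toward the minimum,
# with the hard limit 1.4mm beyond the switch.
margin_mm = safe_overtravel(position_endstop=0.0, position_min=-1.4,
                            position_max=200.0, homing_positive_dir=False)
print(margin_mm)
```

The host would then never queue more than `margin_mm` of travel ahead of the last confirmed "not triggered" report from the endstop's MCU.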
Any further updates on this? FYI, there has been similar work on PR #3956. -Kevin
Thanks. I'm going to close this as it looks like it has been superseded by #3956. -Kevin
Draft pull request just to ask for some feedback/discussion on these changes.
The idea of this branch is to facilitate non-contact end-stops/probes attached to an MCU on a toolhead to reduce the amount of moving wiring points (no need for endstops wired at the end of moving rails/gantries). My prototype board design is here for reference.
There are two main parts to these changes:
This is just a proof of concept and it needs a lot of clean up and testing.