-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proc.(*Process).Continue skips breakpoints [WIP] #259
Conversation
The code changes look good, but I agree there is some more discussion that needs to happen here.
There are some UI considerations with something like this. Let me organize some thoughts on this, because it warrants some discussion. |
I've also noticed that the test fails on OSX (ironically, by triggering the breakpoint too many times). Either there is a similar bug or my change broke it. |
Thanks to @ignatov I found a bug with my changes to I also found a second race condition in |
@aarzilli please rebase on master. I'm going to review and also work on top of this branch, and continue the discussion above. |
Done, note that this is still on top of #255. |
#255 has been merged. Can you please rebase again, I'd like to work on merging your patches in. |
9d546c3
to
7e95711
Compare
Rebased. |
@@ -91,6 +91,7 @@ func (thread *Thread) Step() (err error) { | |||
} | |||
|
|||
err = thread.singleStep() | |||
pc, _ = thread.PC() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems leftover?
The test |
We also need to decide what to do with API and UI, right now we are counting breakpoint hits correctly but only reporting the first one of each group to the user. |
Please ping me when it will be merged. |
Thinking about this a bit more, I have some thoughts which I would love input on from @ignatov, @ironcladlou and @dlsniper since they are working on actual client integrations. First let me ask, how you you all like this information presented to you? Specifically, since we have the concept of "current thread" and "current goroutine", is it enough for Delve to decide which "current breakpoint" to present to clients? If not, would it make sense to return a slice of current breakpoints? To me that would be confusing. Would it instead make more sense to continue to return information on the "current thread / goroutine" and additionally provide a slice containing all thread state, which includes current breakpoints. That way, you could easily use the "current breakpoint" of the "current thread", or check out all the other threads and see if they have a breakpoint set. Would really like feedback on this, and I'd really like to get this merged in asap. |
Having access to "all" plus an indicator of "current" sounds workable to me. I have no background in debugger internals or debugger frontend development prior to messing with Delve/Sublime, so I'm winging it and happily defer to those who have already thought through these issues. I've been learning a lot from go-lang-idea-plugin, and since IntelliJ's debugger rocks, I normally assume whatever is needed to make the IntelliJ plugin work is probably more than enough for my little Sublime plugin. 😁 |
I have moved Is this acceptable? |
@aarzilli that seems fine to me. |
The tests here seem to consistently fail on OSX. I'll try and dig into it so we can get this merged. |
Added a fix for #262 |
done. |
@aarzilli pulled branch and ran tests several times, getting failures probably 80% of the time on OSX. Are you getting any failures in your OSX VM? |
I see what I was seeing here. A rare failure where the process runs to completion immediately after we start it. It did not happen with the |
@aarzilli update: didn't have time to look into this today. I'll be spending some time tomorrow trying to figure out the issue on OSX, with the goal being to get this merged in right after that's sorted out. |
Rebased on master at ad5097a (no conflicts) |
It seems that the behaviour of |
@aarzilli interesting. I'm still trying to track down OSX issues, however I haven't had much time recently to dig into it. I may actually have time today, so let's hope I can figure something out and we can get this merged in. Unrelated, but I've also been working on implementing function calls. That work is sitting on a branch (maybe I'll send a WIP PR just to publicly track the work). I haven't touched it in a week or so (no time) but it was working in a very limited case, but there were things to consider that I wasn't sure about, which maybe would be another good reason to submit a WIP PR and begin discussion around a few things. Anyways, hopefully we can address the changes on tip and the OSX issues with this PR and keep pushing forward. |
@aarzilli leaving these traces here in case they jump out to you for any reason:
AND
Those are the panics I get. I also get errors like "expected to be at line 11, was at line 10", e.g.:
The other failure condition I see is that the test will just run forever. Also, occasionally it passes. Do you see any of these errors running bare metal on Linux? If not, I suspect it may be hidden in the C code (or more concretely the interfacing between Go and C) that is causing the issue, since that would be the major difference there since on Darwin we call into C a lot of syscalls and such. I'm going to keep debugging this for a bit and see if I get anywhere but I figured I'd leave those traces around for you or anyone else who may want to take a stab at debugging this on OSX. |
@aarzilli can you rebase this on master? I'm going to spend some time tomorrow in this branch, trying to fix the OSX issues. |
Done, did you have the time to check if it's the same weird launch race condition that's 'fixed' by the |
@aarzilli yeah I did look into that but it didn't seem to fix it. I think I'm running into another issue, but I think there may still be a race somewhere between Delve and the debugged process manifesting itself oddly. Looking at the stack traces above, the panic always happens in the same spot, when we're trying to parse the G struct, I just haven't had much time to actually dig into it, but I'm going to start now and see what I come up with either today or tomorrow. |
Breakpoints are skipped either because: 1. when multiple breakpoints are hit simultaneously only one is processed 2. a thread hits a breakpoint while another thread is being singlestepped over the breakpoint. Additionally fixed a race condition between Continue and tracee termination.
Next sets its temporary breakpoints with the condition that they must only activate on the current goroutine, and then calls Continue When Continue encounters a temporary breakpoint it clears all the breakpoint. User visible changes: breakpoints that get hit while executing Next are not ignored. This commit does not implement full conditional breakpoints functionality, the only condition that can be set is on the goroutine id. Fixes race conditions in Next affecting TestNextConcurrent.
resume loops in continueOnce moved to a OS specific resume function, this makes the problem easier to deal with and seems to be more appropriate to a windows port given what transpired from discussion of Pull Request go-delve#276
Sometimes after PtraceSingleStep the thread does not advance of a single instruction but is, instead, blocked immediately by a SIGSTOP Made singleStep repeat the process until a SIGTRAP is observed. Unsure where the SIGSTOP comes from.
added the stuff, rebased on master. |
This is a fix for the problem unearthed by PR #255, it is submitted for discussion, not definitive.
There are two problems with
(*Process).Continue
Halt
eacho one the second goroutine may hit the breakpoint too, this second hit gets lostContinue
resumes the process has a race condition with the traceeLet's that thread A hit the breakpoint and thread B was sleeping when
Continue
calledHalt
on it. After we finish processing the breakpoint on A we callContinue
again,Continue
then loops through all threads, it callsresume
on B, then moves onto thread A, sees that it's on a breakpoint so it calls(*Thread).Step()
.If thread B happens to get to the breakpoint while
(*Thread).Step
on thread A is happening the breakpoint gets skipped, because we temporarily cleared it.You can test this by reverting the changes and running
TestBreakpointCounts
. The fact that multiple breakpoints can be hit simultaneously needs to be propagated all the way to the client, either by changing CurrentBreakpoint in State to a slice or by returning one breakpoint at a time in Debugger.