-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revamp FlightRecorder API #72
Conversation
Makes sense to me. The concept does feel nicely integrated into the custom resources API. |
Glad to hear it! I'll start working on some controller code then. |
I've added a boolean |
|
That's right. I think One more thing about the |
Cool, that all sounds good. |
I think this PR is complete enough now. Some noteworthy changes:
Both controllers have their own WebSocket connections to Container JFR. They synchronize with a Mutex to perform their operations without interfering with each other. We don't want one controller to connect to a different JVM while the other is still working. Bumped version since this is a breaking change, and generated new bundle. I'll have to build and push all necessary images to Quay once merged. Delete recording functionality is still not there. I plan to implement it using a Finalizer [1]. This PR has already gotten so large, I'll save deletion for a separate one. [1] https://book.kubebuilder.io/reference/using-finalizers.html |
Nice work, I just gave it a read-through and everything looks great. I'm trying to manually test it to verify how it all works and see how it "feels" to use the API, but I'm having issues with the built image. I did
I'm guessing something has (temporarily) gone wrong with the ubi base image used by operator-sdk. I'll try it again later on and take a look around to see if I can find anyone else reporting the same issue currently. |
Okay, so I still see that issue above when I try building current Pulling and inspecting I'm not sure what to do with this information once I do narrow it down however. There must be some reason that the images are being published with that string now empty. |
... I have no idea what just happened. I Some podman bug, then? That was very confusing. |
Okay, but when I do a @ebaron can you confirm if you're seeing any of this? I'm not sure now if what I'm seeing is maybe just something messed up with my podman installation, or something along those lines. |
Building this branch with no local changes but using |
Ah, on a directly relevant note, should |
I'm seeing the same as you. Building with podman gives empty |
Yes, it should be. Should I do that as part of this PR or a follow-up one? |
I'm fine with that coming as a follow-up PR, or maybe just as part of the recording deletion PR you're planning. |
Found the relevant podman bug: containers/podman#5503 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just got through some manual testing and everything looks like it's working perfectly. Awesome work.
Thanks for the review! I created #78 that I'll address along with the delete recording functionality. |
This PR contains some first steps for a new version of the
FlightRecorder
API that addresses some shortcomings of the current version. A few key shortcomings are:In order to delete recordings with the API, it made most sense to me to promote recordings to be individual Kubernetes objects. When the
Recording
object gets deleted, we could simply instruct Container JFR to also delete the recording. TheFlightRecorder
resource will remain, and contain information that is common across all recordings. This makes it a good spot to hold a JFR event listing.The
FlightRecorder
is still created by the operator and responsible for one service, linked by anObjectReference
as before. TheFlightRecorder
is then associated with any number ofRecording
objects created for that service. ConsiderFlightRecorder
->Recording
analogous toDeployment
->Pod
. This association is handled with a label selector, allowing for fast lookups. This requires each newRecording
to be created with a specific label, but in practice this API will be consumed by tools and not directly.The
Recording
spec contains the options necessary to create a new flight recording, and the status is updated by the controller with information from Container JFR about that recording. In order to support manually stopping continuous recordings, we use an approach similar to the scale subresource for deployments. The spec contains astate
field where the consumer sets a desired state of the recording. This should realistically be either running or stopped. The status contains a correspondingstate
field reflecting the actual current state of the recording, and the operator will try to reconcile this with what is requested in the spec.The interesting code can be found in:
pkg/apis/rhjmc/v1alpha2/flightrecorder_types.go
pkg/apis/rhjmc/v1alpha2/recording_types.go
(particularly the spec and status types within)
There is no controller code in this PR to use these modified API types, I just wanted to get some feedback on this idea before proceeding further.