
Updating KCP ready status after creating required resources #66

Closed
jds9090 opened this issue Dec 7, 2023 · 3 comments · Fixed by #69

@jds9090
Contributor

jds9090 commented Dec 7, 2023

CAPI refers to the control plane's ready status and updates cluster.status.controlPlaneReady accordingly.

I want to understand why the KCP (KamajiControlPlane) status changes to the ready state before the required resources, such as the workload cluster's kubeconfig and CA secret, have been created. I think the main role of the KCP (KamajiControlPlane) is to create the TCP (Kamaji TenantControlPlane) and those required resources. Shouldn't the KCP (KamajiControlPlane) be considered ready only after the required resources have been created?

For example, the status of CAPI's KCP (KubeadmControlPlane) changes to the ready state when at least one control plane node is in the ready state and can communicate with CAPI. That makes sense to me given its main role.

Also, I wonder whether there is any logic, or a CRD field, to verify that communication between the workload cluster and the management cluster is possible using the kubeconfig, or whether such a check is needed in the TCP (Kamaji TenantControlPlane) or KCP (KamajiControlPlane).

For example, the KCP (KamajiControlPlane) or TCP (TenantControlPlane) could expose a field such as API SERVER READY that the operator checks and updates regularly.
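A minimal sketch of what such a periodic check could look like, assuming the operator already holds the admin kubeconfig bytes; the helper name and the idea of an apiServerReady status field are hypothetical, not part of the current Kamaji or provider API:

package controllers

import (
	"context"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// probeTenantAPIServer reports whether the Tenant Control Plane API server is
// reachable with the generated admin kubeconfig. A controller could run this
// on a timer and mirror the result into a hypothetical apiServerReady field.
func probeTenantAPIServer(ctx context.Context, kubeconfig []byte) bool {
	cfg, err := clientcmd.RESTConfigFromKubeConfig(kubeconfig)
	if err != nil {
		return false
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return false
	}
	// /readyz answering without error is a reasonable "API SERVER READY" signal.
	return cs.Discovery().RESTClient().Get().AbsPath("/readyz").Do(ctx).Error() == nil
}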

@prometherion
Member

I'm a bit confused about this, so I tried it, and this is my resulting output.

$: kubectl get kamajicontrolplanes.controlplane.cluster.x-k8s.io -w
NAME                       VERSION   INITIALIZED   READY   AGE
capi-quickstart-kubevirt   1.23.10                         0s
capi-quickstart-kubevirt   1.23.10   false         false   0s
capi-quickstart-kubevirt   1.23.10   true          false   0s
capi-quickstart-kubevirt   1.23.10   true          false   0s
capi-quickstart-kubevirt   1.23.10   true          false   2s
capi-quickstart-kubevirt   1.23.10   true          false   2s
capi-quickstart-kubevirt   1.23.10   true          true    13s

These are the events we're managing:

I want to make sure why KCP(Kamaji Control Plane)'s status changes to the ready state before creating required resources such as workloads' kubeconfig and CA secret

We could improve this, maybe shifting the Ready check to after the retrieval of the CA and the admin kubeconfig; seems legit.
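A rough sketch of that ordering, with Ready gated on both Secrets actually being retrievable from the management cluster; the Secret naming convention below is illustrative, not the provider's real one:

package controllers

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/types"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// controlPlaneSecretsAvailable returns true only once both the CA and the
// admin kubeconfig Secrets exist; the reconciler would then set
// status.Ready = tenantControlPlaneIsReady && controlPlaneSecretsAvailable(...).
func controlPlaneSecretsAvailable(ctx context.Context, c client.Client, namespace, tcpName string) bool {
	// Hypothetical naming: "<tcp>-ca" and "<tcp>-admin-kubeconfig".
	for _, name := range []string{tcpName + "-ca", tcpName + "-admin-kubeconfig"} {
		secret := &corev1.Secret{}
		if err := c.Get(ctx, types.NamespacedName{Namespace: namespace, Name: name}, secret); err != nil {
			return false
		}
	}
	return true
}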

@jds9090
Contributor Author

jds9090 commented Dec 13, 2023

The Cluster (CAPI CRD) has a field called controlPlaneReady, which is set from the KCP (KamajiControlPlane) Ready state. From the Cluster's perspective, if the KCP (KamajiControlPlane) is ready, the control plane should be ready. But it may not be ready, because the fact that the KCP (KamajiControlPlane) is ready doesn't mean the control plane is ready.

The fact that the KCP (KamajiControlPlane) is ready means that the link between CAPI and the Tenant Control Plane is ready; the Tenant Control Plane itself may not be ready even when the KCP (KamajiControlPlane) is.

So we cannot trust cluster.status.controlPlaneReady. @prometherion What do you think about it?

Additionally, there is a gap between the Cluster (CAPI CRD) and the KCP (KamajiControlPlane). The Cluster (CAPI CRD) can't catch changes to the KCP (KamajiControlPlane); its controller assumes the control plane is based on Machines (CAPI CRD):

func (r *Reconciler) SetupWithManager(ctx context.Context, mgr ctrl.Manager, options controller.Options) error {
	c, err := ctrl.NewControllerManagedBy(mgr).
		For(&clusterv1.Cluster{}).
		Watches(
			&clusterv1.Machine{},
			handler.EnqueueRequestsFromMapFunc(r.controlPlaneMachineToCluster),
		).
		WithOptions(options).
		WithEventFilter(predicates.ResourceNotPausedAndHasFilterLabel(ctrl.LoggerFrom(ctx), r.WatchFilterValue)).
		Build(r)

	if err != nil {
		return errors.Wrap(err, "failed setting up with a controller manager")
	}

	r.recorder = mgr.GetEventRecorderFor("cluster-controller")
	r.externalTracker = external.ObjectTracker{
		Controller: c,
		Cache:      mgr.GetCache(),
	}
	return nil
}
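For illustration only, closing that gap would mean mapping control-plane object events back to the owning Cluster, mirroring controlPlaneMachineToCluster above. The mapper below is hypothetical and assumes the same imports as the snippet, plus "strings" and k8s.io/apimachinery/pkg/types:

// kamajiControlPlaneToCluster is a hypothetical mapper: it re-queues the
// owning Cluster whenever the watched control-plane object changes.
func (r *Reconciler) kamajiControlPlaneToCluster(ctx context.Context, o client.Object) []ctrl.Request {
	for _, ref := range o.GetOwnerReferences() {
		// The Cluster owns its control-plane object through an owner reference.
		if ref.Kind == "Cluster" && strings.HasPrefix(ref.APIVersion, "cluster.x-k8s.io/") {
			return []ctrl.Request{{
				NamespacedName: types.NamespacedName{Namespace: o.GetNamespace(), Name: ref.Name},
			}}
		}
	}
	return nil
}

It would then be wired in with another Watches(...) clause next to the Machine one, or through the externalTracker that is already initialized above.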

@prometherion prometherion added this to the v0.4.2 milestone Dec 13, 2023
@prometherion prometherion added the bug Something isn't working label Dec 13, 2023
@prometherion
Member

But it may be not ready because the fact that KCP(kamaji control plane) is ready doesn't mean control plane is ready

That's wrong: we're inheriting that readiness from Kamaji itself and reporting it back.

Additionally, There is a gap between Cluster(CAPI CRD) and KCP(Kamaji Control Plane). Cluster(CAPI CRD) can't catch changes about KCP(Kamaji Control Plane). It assumes control plane is based on Machine(CAPI CRD).

I would rather discuss this in another issue. Do you mind opening one, along with some examples and the desired outcome?
