Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve registration on already installed systems #906

Conversation

anmazzotti
Copy link
Contributor

@anmazzotti anmazzotti self-assigned this Jul 10, 2023
@anmazzotti anmazzotti marked this pull request as ready for review July 17, 2023 07:32
@anmazzotti anmazzotti requested a review from a team as a code owner July 17, 2023 07:32
Copy link
Contributor

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good just some copy & paste errors and I wonder if we shouldn't set the timer enabled by default and not include any cloud-config for it at all.

@@ -2,4 +2,4 @@ name: "Elemental operator bootstrap"
stages:
network.after:
- commands:
- systemctl start elemental-register.service
- systemctl start elemental-register.timer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can consider entirely removing this file and start the registration timer by default. The default enabled services are listed in a systemd Elemental branding package in OBS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave it here right now for simplicity, this issue already has three PRs as dependencies.
I can open a follow up issue to discuss this if you think it's the case.

Copy link
Contributor

@davidcassany davidcassany Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in fact. Also thinking it further probably we can keep it like this as running the service from the cloud-config allows us to set a different start or setup for different scenarios (live media, active, fallback or recovery systems).

We could omit it in live media and in that case run something more aggressive like elemental-register --install (assuming we have a flag to trigger installation or not)


%post
%service_add_post elemental-populate-node-labels.service
%service_add_post shutdown-containerd.service
%service_add_post elemental-register.service
%service_add_pre elemental-register.timer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be %service_add_post. Copy & paste error I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely too much copy/paste. Thank you.


%preun
%service_del_preun elemental-populate-node-labels.service
%service_del_preun shutdown-containerd.service
%service_del_preun elemental-register.service
%service_add_pre elemental-register.timer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same sort error here as above, the macro is not correct


%postun
%service_del_postun elemental-populate-node-labels.service
%service_del_postun shutdown-containerd.service
%service_del_postun elemental-register.service
%service_add_pre elemental-register.timer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same sort error here as above, the macro is not correct

@@ -3,10 +3,19 @@ Description=Elemental Register
Documentation=https://elemental.docs.rancher.com
Wants=network-online.target
After=network-online.target
# Backoff after 5 attempts in 5 min
StartLimitIntervalSec=5min
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity. Do we know if once the limit is achieved the timer will restart the counter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity. Do we know if once the limit is achieved the timer will restart the counter?

Yes, this is actually the reason we use CollectMode=inactive-or-failed, so that it will not enter a failed state and the Timer will just restart it.

Copy link
Contributor

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

Documentation=https://elemental.docs.rancher.com

[Timer]
OnStartupSec=5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if this shouldn't we use the OnBootSec here, but not so sure to be honest. It is not clear to me to what point this delay is applied to. Is it when the timer is started? So it starts the registration after 5seconds the timer starts? If that's the case then this is fine I'd say.

Copy link
Contributor Author

@anmazzotti anmazzotti Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be equivalent for Timers as far I can understand the doc.

Defines a timer relative to when the service manager was first started.  
For system timer units this is very similar to OnBootSec= as the system service manager is generally started very early at boot.  
It's primarily useful when configured in units running in the per-user service manager, as the user service manager is generally started on first login only, not already during boot.

I thought OnStartup would be safer as we have dependencies on network and so on.

@@ -2,4 +2,4 @@ name: "Elemental operator bootstrap"
stages:
network.after:
- commands:
- systemctl start elemental-register.service
- systemctl start elemental-register.timer
Copy link
Contributor

@davidcassany davidcassany Jul 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in fact. Also thinking it further probably we can keep it like this as running the service from the cloud-config allows us to set a different start or setup for different scenarios (live media, active, fallback or recovery systems).

We could omit it in live media and in that case run something more aggressive like elemental-register --install (assuming we have a flag to trigger installation or not)

@anmazzotti anmazzotti merged commit 2680553 into rancher:main Jul 18, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve re-registration behavior on already installed systems
2 participants