[JENKINS-72284] Take agents offline and online immediately #198

Merged
merged 13 commits into jenkinsci:master on Feb 9, 2024

Contributor

@mawinter69 mawinter69 commented Nov 19, 2023

[JENKINS-72284] Take agents offline and online immediately

[JENKINS-72284] When an agent with the wrong Java version connects, it is taken offline right away. Previously, the agent was only taken offline when someone visited the /computer page.
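
The timing change can be pictured with a small self-contained model. This is only a sketch: `Agent`, `Controller`, and `controllerRequiring` are hypothetical names invented for illustration, not Jenkins classes, and the "minimum Java major version" rule is an assumed stand-in for the plugin's real comparison. The point is that the check runs in a connect listener instead of waiting for a page visit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Self-contained model; these types are hypothetical, not Jenkins classes.
class ConnectCheckSketch {

    /** Hypothetical agent: a name, the Java major version it runs, and an offline flag. */
    static final class Agent {
        final String name;
        final int javaMajor;
        boolean offline;

        Agent(String name, int javaMajor) {
            this.name = name;
            this.javaMajor = javaMajor;
        }
    }

    /** Hypothetical controller that notifies listeners as soon as an agent connects. */
    static final class Controller {
        private final List<Consumer<Agent>> connectListeners = new ArrayList<>();

        void onConnect(Consumer<Agent> listener) {
            connectListeners.add(listener);
        }

        void connect(Agent agent) {
            for (Consumer<Agent> listener : connectListeners) {
                listener.accept(agent);
            }
        }
    }

    /** Builds a controller that takes an agent offline on connect if its Java is too old. */
    static Controller controllerRequiring(int minimumJavaMajor) {
        Controller controller = new Controller();
        controller.onConnect(agent -> {
            // Assumed rule for illustration: the agent needs at least this Java major version.
            if (agent.javaMajor < minimumJavaMajor) {
                agent.offline = true;
            }
        });
        return controller;
    }

    public static void main(String[] args) {
        Controller controller = controllerRequiring(17);
        Agent oldAgent = new Agent("freebsd-1", 11);
        controller.connect(oldAgent);
        System.out.println(oldAgent.name + " offline: " + oldAgent.offline);
    }
}
```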

Use monitor-specific offline causes that inherit from MonitorOfflineCause. This allows the agent to be brought online again automatically after it has been restarted with the correct Java/remoting version. Jenkins 2.433 and later will then reset the Admin monitor if that was the only reason that the agent was taken offline.
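
Why a monitor-specific cause makes the automatic recovery safe can be sketched as follows. This is a simplified model under stated assumptions, not the plugin's code: `MonitorCause`, `JvmVersionCause`, `ManualCause`, `Agent`, and `check` are hypothetical names, and the real MonitorOfflineCause type in Jenkins core is not used here. The idea shown is that a monitor only clears an offline state that it set itself.

```java
// Self-contained model; these types are hypothetical, not the Jenkins API.
class OfflineCauseSketch {

    /** Hypothetical marker for causes set by a monitor; records which monitor set them. */
    interface MonitorCause {
        String trigger();
    }

    /** Cause set by the (hypothetical) JVM version monitor. */
    static final class JvmVersionCause implements MonitorCause {
        @Override
        public String trigger() {
            return "jvmVersion";
        }
    }

    /** Cause set by a person; a monitor must never clear it. */
    static final class ManualCause {
        final String user;

        ManualCause(String user) {
            this.user = user;
        }
    }

    static final class Agent {
        private Object offlineCause; // null means the agent is online

        boolean isOffline() {
            return offlineCause != null;
        }

        void takeOffline(Object cause) {
            offlineCause = cause;
        }

        /** Brings the agent online only if the named monitor was what took it offline. */
        boolean markOnline(String monitor) {
            if (offlineCause instanceof MonitorCause
                    && ((MonitorCause) offlineCause).trigger().equals(monitor)) {
                offlineCause = null;
                return true;
            }
            return false;
        }
    }

    /** One pass of the JVM version monitor for a single agent. */
    static void check(Agent agent, boolean compatible) {
        if (compatible) {
            agent.markOnline("jvmVersion");
        } else if (!agent.isOffline()) {
            agent.takeOffline(new JvmVersionCause());
        }
    }

    public static void main(String[] args) {
        Agent agent = new Agent();
        check(agent, false); // incompatible Java: taken offline by the monitor
        check(agent, true);  // restarted with a compatible Java: back online
        System.out.println("offline after fix: " + agent.isOffline());
    }
}
```

In this model an agent that an administrator took offline stays offline even when the version check passes, because its cause does not belong to the monitor.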

Jenkins core 2.434 or newer made 2 methods public that are required for this pull request.

Requires Jenkins 2.440 or newer because that has been selected as the next LTS baseline.

Testing done

I compared 2.426.3 with the existing release, 2.444 with the existing release, and 2.444 with the incremental build of this plugin.

2.426.3 and current plugin release

  • Confirmed that a Pipeline job that requires FreeBSD will use Java 17 on my Jenkins 2.426.3 controller that runs Java 21 (since Java 21 is not yet available as a FreeBSD package):
pipeline {
    agent {
        label 'FreeBSD'
    }
    stages {
        stage('FreeBSD-Agent') {
            steps {
                sh 'hostname && uname -a && java -version'
            }
        }
    }
}
  • Confirmed that when I click the "Apply" button on the "Node monitoring configuration" page on Jenkins 2.426.3 after enabling "Disconnect agent when incompatibility is found", the FreeBSD agents remain connected even though they should have been disconnected. When the Pipeline job runs, it reports that it was run on a FreeBSD agent and used Java 17
  • Confirmed that when I visit the "Computer" page http://mark-pc2.markwaite.net:8080/computer/ then the FreeBSD agents are disconnected and the Pipeline job blocks waiting for a FreeBSD agent
  • Confirmed that when I disable the "Disconnect agent when incompatibility is found" setting and click the "Apply" button, the FreeBSD agents remain offline even though the setting is now disabled
  • Confirmed that when I visit each agent page and click "Bring this node online", the FreeBSD agent is brought online and is running Java 17
  • Confirmed that the Pipeline runs to completion once a FreeBSD agent is online
  • Confirmed that the nodeMonitors configuration as code uses the following successfully:
  nodeMonitors:
  - diskSpaceMonitor:
      freeSpaceThreshold: "1GB"
  - tmpSpace:
      freeSpaceThreshold: "1GB"
  - jvmVersion:
      comparisonMode: RUNTIME_GREATER_OR_EQUAL_MASTER_BYTECODE
      disconnect: false
  - "remotingVersion"

2.444 and current plugin release

  • Confirmed that the nodeMonitors configuration as code passes Jenkins startup when using:
  nodeMonitors:
  - diskSpaceMonitor:
      freeSpaceThreshold: "1GB"
  - tmpSpace:
      freeSpaceThreshold: "1GB"
  - jvmVersion:
      comparisonMode: RUNTIME_GREATER_OR_EQUAL_MASTER_BYTECODE
      disconnect: false
  - "remotingVersion"
  • Confirmed that disabling or enabling "Don't mark agents temporarily offline" and clicking "Apply" does not change the connection status of the FreeBSD agent that is running Java 17 and connected to a Java 21 controller

2.440.1-rc and this pull request

  • Confirmed that the nodeMonitors configuration as code loads correctly when using:
  nodeMonitors:
  - diskSpaceMonitor:
      freeSpaceThreshold: "1GB"
  - tmpSpace:
      freeSpaceThreshold: "1GB"
  - jvmVersion:
      comparisonMode: RUNTIME_GREATER_OR_EQUAL_MASTER_BYTECODE
      disconnect: false
  - "remotingVersion"
  • Confirmed that the agents are immediately taken offline when the new checkbox "Don't mark agents temporarily offline" is disabled
  • Confirmed that agents are not taken offline when the checkbox "Don't mark agents temporarily offline" is enabled
  • Confirmed that agents are brought online immediately after the "Don't mark agents temporarily offline" checkbox is enabled and the "Apply" button is clicked

The new behavior of immediately taking the agent offline and immediately bringing it online when the change is applied is very, very nice. Thanks!

Submitter checklist

  • Make sure you are opening from a topic/feature/bugfix branch (right side) and not your main branch!
  • Ensure that the pull request title represents the desired changelog entry
  • Please describe what you did
  • Link to relevant issues in GitHub or Jira
  • Link to relevant pull requests, esp. upstream and downstream changes
  • Ensure you have provided tests that demonstrate the feature works or the issue is fixed

[JENKINS-72284] When an agent with the wrong Java version connects, it will
be taken offline right away. Previously, the agent was only taken offline
when one visited the /computer page.
Use monitor-specific offline causes that inherit from
MonitorOfflineCause. This allows the agent to be brought online again
automatically after it has been restarted with the correct Java/remoting version.
Also, with jenkinsci/jenkins#8618 it would then
reset the Admin monitor if that was the only reason that it fired.

Requires jenkinsci/jenkins#8593, which made 2
methods public.
@MarkEWaite MarkEWaite changed the title improve how agents are token offline and online improve how agents are taken offline and online Nov 21, 2023
@viceice
Member

viceice commented Feb 6, 2024

Anything missing here? I would really like to have the warning automatically disabled when the node is online again. Currently this needs a Jenkins restart.

@MarkEWaite
Contributor

MarkEWaite commented Feb 6, 2024

Anything missing here? I would really like to have the warning automatically disabled when the node is online again. Currently this needs a Jenkins restart.

I was waiting for the pull request to no longer have conflicting files and to not be a draft pull request. Have you run it in your environment? Does it meet your needs? If so, then you can use the incremental build.

@github-actions github-actions bot added the dependencies Dependency related change label Feb 6, 2024
@mawinter69
Contributor Author

I need to review the behaviour and remove the now-duplicate disconnect for the JVM monitor.

@github-actions github-actions bot added the tests Automated test addition or improvement label Feb 6, 2024
@mawinter69 mawinter69 marked this pull request as ready for review February 6, 2024 17:40
@mawinter69 mawinter69 requested a review from a team as a code owner February 6, 2024 17:40
@MarkEWaite MarkEWaite added enhancement Improvement or new feature and removed dependencies Dependency related change tests Automated test addition or improvement labels Feb 6, 2024
@MarkEWaite MarkEWaite changed the title improve how agents are taken offline and online Take agents offline and online more promptly Feb 6, 2024
@MarkEWaite MarkEWaite changed the title Take agents offline and online more promptly Take agents offline and online immediately for version compatibility issues Feb 6, 2024
@MarkEWaite MarkEWaite changed the title Take agents offline and online immediately for version compatibility issues JENKINS-72284] Take agents offline and online immediately for version compatibility issues Feb 6, 2024
@MarkEWaite MarkEWaite changed the title JENKINS-72284] Take agents offline and online immediately for version compatibility issues [JENKINS-72284] Take agents offline and online immediately for version compatibility issues Feb 6, 2024
@MarkEWaite MarkEWaite changed the title [JENKINS-72284] Take agents offline and online immediately for version compatibility issues [JENKINS-72284] Immediately take agents offline and online on version compatibility issues Feb 6, 2024
@MarkEWaite MarkEWaite changed the title [JENKINS-72284] Immediately take agents offline and online on version compatibility issues [JENKINS-72284] Immediately take agents offline and online for version compatibility issues Feb 6, 2024
@MarkEWaite MarkEWaite added the breaking Breaks compatibility with previous releases label Feb 7, 2024
@github-actions github-actions bot added the dependencies Dependency related change label Feb 7, 2024
@github-actions github-actions bot added documentation Improvements or additions to documentation tests Automated test addition or improvement labels Feb 7, 2024
@mawinter69
Contributor Author

@MarkEWaite fixed the casc issue


// should be restricted/deprecated but that breaks casc
@DataBoundSetter
public void setDisconnect(boolean disconnect) {
Contributor Author

That method is required for backward compatibility with CasC. Ideally it should be deprecated (as should isDisconnected), but then CasC will fail unless one adds

configuration-as-code:
  version: 1
  deprecated: warn

to the CasC config.
Similarly, using @Restricted(DoNotUse.class) makes CasC fail.
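
The trade-off described in this comment can be illustrated with a small reflection-based model. It is a sketch only: `JvmMonitorBean`, `DeprecationMode`, and `bind` are hypothetical names, the binder is not the Configuration as Code plugin's implementation, and `@DataBoundSetter` is left out so the example compiles without Jenkins on the classpath. It shows why marking a setter as deprecated turns a previously valid configuration into a failure unless the configuration opts in to warnings, as with `deprecated: warn`.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Self-contained model; these types are hypothetical, not the CasC plugin's API.
class DeprecatedSetterSketch {

    /** Hypothetical equivalent of choosing whether deprecated attributes fail or warn. */
    enum DeprecationMode { REJECT, WARN }

    /** Hypothetical monitor bean whose setter is kept only for old configurations. */
    static class JvmMonitorBean {
        boolean disconnect;

        @Deprecated
        public void setDisconnect(boolean disconnect) {
            this.disconnect = disconnect;
        }
    }

    /** Applies one boolean attribute through its setter and returns any warnings. */
    static List<String> bind(Object target, String attribute, boolean value, DeprecationMode mode) {
        String setterName = "set" + Character.toUpperCase(attribute.charAt(0)) + attribute.substring(1);
        List<String> warnings = new ArrayList<>();
        try {
            Method setter = target.getClass().getMethod(setterName, boolean.class);
            if (setter.isAnnotationPresent(Deprecated.class)) {
                if (mode == DeprecationMode.REJECT) {
                    // A deprecated attribute makes the whole configuration fail.
                    throw new IllegalStateException("'" + attribute + "' is deprecated");
                }
                warnings.add("'" + attribute + "' is deprecated");
            }
            setter.invoke(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot set '" + attribute + "'", e);
        }
        return warnings;
    }

    public static void main(String[] args) {
        JvmMonitorBean bean = new JvmMonitorBean();
        List<String> warnings = bind(bean, "disconnect", true, DeprecationMode.WARN);
        System.out.println("disconnect=" + bean.disconnect + ", warnings=" + warnings.size());
    }
}
```

Leaving the setter undeprecated, as the pull request does, keeps existing configurations loading without requiring users to change their CasC settings.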

@MarkEWaite MarkEWaite removed breaking Breaks compatibility with previous releases dependencies Dependency related change documentation Improvements or additions to documentation tests Automated test addition or improvement labels Feb 9, 2024
@MarkEWaite MarkEWaite changed the title [JENKINS-72284] Immediately take agents offline and online for version compatibility issues [JENKINS-72284] Immediately take agents offline and online for compatibility issues Feb 9, 2024
@MarkEWaite MarkEWaite changed the title [JENKINS-72284] Immediately take agents offline and online for compatibility issues [JENKINS-72284] Take agents offline and online immediately Feb 9, 2024
Contributor

@MarkEWaite MarkEWaite left a comment

Thanks very much. Testing looks good. Ready to release.

@MarkEWaite MarkEWaite merged commit 47a3737 into jenkinsci:master Feb 9, 2024
15 checks passed
Labels
enhancement Improvement or new feature