This isn't a question or user support case (For Q&A and community support, go to Discussions).
I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
To Reproduce
-
Describe the bug
We have a GitHub Action that runs once a day, with a dedicated type of runner allocated specifically for it. While the GitHub Action runs, we receive batches of messages about task execution. In the last batch, statistics.totalIdleRunners and statistics.totalRegisteredRunners contain non-zero values.
The controller publishes these values as Prometheus metrics. After this last message, the metric values do not change until the next runner execution the following day.
Is it possible to fix this behavior, or does it require changes on the GitHub side?
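To make the failure mode concrete, here is a minimal Go sketch of how I understand the behavior. It is not the ghalistener source: the `Statistics` and `Gauges` types and the `Observe` method are hypothetical stand-ins, and the only assumption is that the gauges are written solely when a message carrying statistics arrives. The values are taken from messages 1090 and 1091 in the controller logs.

```go
package main

import "fmt"

// Statistics mirrors two of the fields seen in the listener log output.
type Statistics struct {
	TotalRegisteredRunners int
	TotalIdleRunners       int
}

// Gauges is a hypothetical stand-in for the exported Prometheus gauges.
type Gauges struct {
	RegisteredRunners float64
	IdleRunners       float64
}

// Observe copies the statistics of a received message into the gauges.
// In this sketch nothing else ever writes to them.
func (g *Gauges) Observe(s Statistics) {
	g.RegisteredRunners = float64(s.TotalRegisteredRunners)
	g.IdleRunners = float64(s.TotalIdleRunners)
}

func main() {
	g := &Gauges{}
	g.Observe(Statistics{TotalRegisteredRunners: 4, TotalIdleRunners: 0}) // message 1090
	g.Observe(Statistics{TotalRegisteredRunners: 2, TotalIdleRunners: 1}) // message 1091

	// No further message arrives until the next day, so the gauges keep
	// the values from message 1091 even after the runners are gone.
	fmt.Println(g.RegisteredRunners, g.IdleRunners)
}
```

Under that assumption the gauges stay at 2 registered and 1 idle runner for the rest of the day, which matches what we see in Prometheus.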
Describe the expected behavior
The ghalistener Prometheus metrics should reflect the actual state of the runners.
Additional Context
-
Controller Logs
2024-05-26T07:24:29Z INFO listener-app.listener Getting next message {"lastMessageID": 1089}
2024-05-26T07:24:37Z INFO listener-app.listener Processing message {"messageId": 1090, "messageType": "RunnerScaleSetJobMessages"}
2024-05-26T07:24:37Z INFO listener-app.listener New runner scale set statistics. {"statistics": {"totalAvailableJobs":0,"totalAcquiredJobs":3,"totalAssignedJobs":3,"totalRunningJobs":3,"totalRegisteredRunners":4,"totalBusyRunners":3,"totalIdleRunners":0}}
2024-05-26T07:24:37Z INFO listener-app.listener Job completed message received. {"RequestId": 669571, "Result": "succeeded", "RunnerId": 83622, "RunnerName": "terraform-drift-checker-hxppm-runner-9q2hx"}
2024-05-26T07:24:37Z INFO listener-app.listener Deleting last message {"lastMessageID": 1090}
2024-05-26T07:24:38Z INFO listener-app.worker.kubernetesworker Calculated target runner count {"assigned job": 3, "decision": 3, "min": 0, "max": 30, "currentRunnerCount": 3, "jobsCompleted": 1}
2024-05-26T07:24:38Z INFO listener-app.worker.kubernetesworker Compare {"original": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":-1,\"patchID\":-1,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}", "patch": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":3,\"patchID\":6917,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}"}
2024-05-26T07:24:38Z INFO listener-app.worker.kubernetesworker Preparing EphemeralRunnerSet update {"json": "{\"spec\":{\"patchID\":6917,\"replicas\":3}}"}
2024-05-26T07:24:38Z INFO listener-app.worker.kubernetesworker Ephemeral runner set scaled. {"namespace": "github-actions-runner", "name": "terraform-drift-checker-hxppm", "replicas": 3}
2024-05-26T07:24:38Z INFO listener-app.listener Getting next message {"lastMessageID": 1090}
2024-05-26T07:24:50Z INFO listener-app.listener Processing message {"messageId": 1091, "messageType": "RunnerScaleSetJobMessages"}
2024-05-26T07:24:50Z INFO listener-app.listener New runner scale set statistics. {"statistics": {"totalAvailableJobs":0,"totalAcquiredJobs":0,"totalAssignedJobs":0,"totalRunningJobs":0,"totalRegisteredRunners":2,"totalBusyRunners":0,"totalIdleRunners":1}}
2024-05-26T07:24:50Z INFO listener-app.listener Job completed message received. {"RequestId": 669572, "Result": "succeeded", "RunnerId": 83625, "RunnerName": "terraform-drift-checker-hxppm-runner-6dmvl"}
2024-05-26T07:24:50Z INFO listener-app.listener Job completed message received. {"RequestId": 669573, "Result": "succeeded", "RunnerId": 83623, "RunnerName": "terraform-drift-checker-hxppm-runner-lcc6k"}
2024-05-26T07:24:50Z INFO listener-app.listener Job completed message received. {"RequestId": 669574, "Result": "succeeded", "RunnerId": 83624, "RunnerName": "terraform-drift-checker-hxppm-runner-d2bv7"}
2024-05-26T07:24:50Z INFO listener-app.listener Deleting last message {"lastMessageID": 1091}
Runner Pod Logs
-
verdel changed the title from "Invalid values for the metrics gha_registered_runners and gha_idle_runners in ghalistener<Please write what didn't work for you here>" to "Invalid values for the metrics gha_registered_runners and gha_idle_runners in ghalistener" on May 26, 2024
You are right, we receive an empty batch if no activity is needed, so the metric would be incorrect when the cluster becomes idle.
Ideally, to reflect the correct metric, the changes should be made on the API side. However, we can optimistically set this metric to the desired count when the cluster becomes idle. Let me discuss it with the team, and I'll get back to you with more information ☺️
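As a rough illustration of the optimistic approach, the sketch below resets the gauges when an empty batch arrives. This is only a sketch under assumptions, not the actual listener code: `Gauges`, `HandleBatch`, and `ResetToDesired` are hypothetical names, an empty batch is modeled as a nil statistics pointer, and "desired count" is assumed to be the number of runners the scale set should hold when idle (0 in the logs above, where "min" is 0).

```go
package main

import "fmt"

// Statistics mirrors two of the fields seen in the listener log output.
type Statistics struct {
	TotalRegisteredRunners int
	TotalIdleRunners       int
}

// Gauges is a hypothetical stand-in for the exported Prometheus gauges.
type Gauges struct {
	RegisteredRunners float64
	IdleRunners       float64
}

// Observe copies the statistics of a received message into the gauges.
func (g *Gauges) Observe(s Statistics) {
	g.RegisteredRunners = float64(s.TotalRegisteredRunners)
	g.IdleRunners = float64(s.TotalIdleRunners)
}

// ResetToDesired optimistically assumes that an idle scale set holds
// exactly the desired number of runners and that all of them are idle.
func (g *Gauges) ResetToDesired(desired int) {
	g.RegisteredRunners = float64(desired)
	g.IdleRunners = float64(desired)
}

// HandleBatch updates the gauges for one poll result. A nil stats pointer
// models an empty batch, i.e. no activity was needed (an assumption).
func HandleBatch(g *Gauges, stats *Statistics, desired int) {
	if stats == nil {
		g.ResetToDesired(desired)
		return
	}
	g.Observe(*stats)
}

func main() {
	g := &Gauges{}
	// Last message with statistics (message 1091 in the logs).
	HandleBatch(g, &Statistics{TotalRegisteredRunners: 2, TotalIdleRunners: 1}, 0)
	// Next poll returns an empty batch; the desired count is 0.
	HandleBatch(g, nil, 0)
	fmt.Println(g.RegisteredRunners, g.IdleRunners)
}
```

The trade-off is that the reset value is a guess rather than a measurement, so it could briefly disagree with the real state if runners are still being removed when the empty batch arrives.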
Checks
Controller Version
0.9.2
Deployment Method
Helm