Description
Currently, K8up only emits time series for schedules for which it has seen at least one Job with the matching completion state. For example, k8up_jobs_successful_counter only has time series for schedules that have had at least one successful job since the last K8up restart.
For schedules that run relatively infrequently (e.g. once per day), this can leave significant gaps in the metric in Prometheus, which confuses Prometheus functions such as rate() that could otherwise compensate for counter resets caused by pod restarts.
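The gap follows from how labeled counters behave in prometheus/client_golang, which K8up presumably uses for its metrics: a child series of a CounterVec is only created, and therefore only exported on /metrics, once its label combination is touched for the first time. A minimal sketch (not K8up's actual code; the label names are assumptions for illustration):

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"
)

func main() {
	// Metric name matches the issue; label names are assumptions for illustration.
	jobsSuccessful := prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "k8up_jobs_successful_counter",
		Help: "Number of successful jobs",
	}, []string{"namespace", "jobType"})

	// Before any job has completed, the vector exports no series at all.
	fmt.Println(testutil.CollectAndCount(jobsSuccessful)) // prints 0

	// Only after the label combination is touched for the first time does a
	// time series for that namespace/job type appear.
	jobsSuccessful.WithLabelValues("my-namespace", "backup").Inc()
	fmt.Println(testutil.CollectAndCount(jobsSuccessful)) // prints 1
}
```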
Additional Context
No response
Logs
No response
Expected Behavior
Immediately after startup, K8up initializes the counter metrics (k8up_jobs_failed_counter, k8up_jobs_successful_counter, and k8up_jobs_total) with value 0 for all job types and for every namespace in which a Schedule exists.
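A minimal sketch of what this could look like, assuming prometheus/client_golang and made-up label names and job type values that would need to match K8up's actual metric definitions: touching each label combination once at startup makes every series appear on /metrics with value 0.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// initJobCounters touches every (namespace, jobType) combination on the given
// counter vectors so that each series is exported with value 0 right away.
// The label order and the job type list are assumptions for illustration.
func initJobCounters(namespacesWithSchedules []string, counters ...*prometheus.CounterVec) {
	jobTypes := []string{"backup", "check", "prune", "archive", "restore"}
	for _, ns := range namespacesWithSchedules {
		for _, jobType := range jobTypes {
			for _, c := range counters {
				// WithLabelValues creates the child counter at 0 if it does
				// not exist yet; Add(0) leaves the value unchanged.
				c.WithLabelValues(ns, jobType).Add(0)
			}
		}
	}
}
```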
Steps To Reproduce
1. Create a Schedule.
2. Check K8up's /metrics endpoint and observe that there are no k8up_jobs_* time series for the namespace of the new Schedule until the first job runs (see the sketch below).
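A hypothetical helper for step 2, not part of K8up: fetch the operator's /metrics endpoint and print any k8up_jobs_* lines for the namespace in question. The URL, port, and namespace label are assumptions and need to be adjusted to your deployment.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Adjust the URL to wherever the K8up operator exposes its metrics.
	resp, err := http.Get("http://localhost:8080/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	found := false
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		// "namespace" as label name is an assumption for illustration.
		if strings.HasPrefix(line, "k8up_jobs_") && strings.Contains(line, `namespace="my-namespace"`) {
			fmt.Println(line)
			found = true
		}
	}
	if !found {
		fmt.Println("no k8up_jobs_* series for my-namespace yet")
	}
}
```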
Version of K8up
v2.7.2
Version of Kubernetes
v1.27.13
Distribution of Kubernetes
OpenShift 4
Just to confirm my understanding: the problem would be fixed if K8up emitted those metrics with labels for all existing schedules and a value of 0?
Yes. If I understand Prometheus's behavior correctly, emitting the metrics with labels for all existing schedules and a value of 0 until the first job is observed would let Prometheus correctly identify counter resets.