support chart, node-exporter: tolerate 2i2c.org/community tainted nodes #3209
prometheus-node-exporter:
  tolerations:
    # Tolerate tainted jupyterhub user nodes
    - key: hub.jupyter.org_dedicated
      value: user
      effect: NoSchedule
    - key: hub.jupyter.org/dedicated
      value: user
      effect: NoSchedule
    # Tolerate tainted dask worker nodes
    - key: k8s.dask.org_dedicated
      value: worker
      effect: NoSchedule
    - key: k8s.dask.org/dedicated
      value: worker
      effect: NoSchedule
These are the chart's defaults:
tolerations:
  - effect: NoSchedule
    operator: Exists
Good catch!
Thank you @yuvipanda for reviewing!!
🎉🎉🎉🎉 Monitor the deployment of the hubs here 👉 https://github.com/2i2c-org/infrastructure/actions/runs/6384535727
By default, the prometheus-node-exporter chart declares a toleration that lets node-exporter schedule on any node with a "NoSchedule" taint, regardless of the taint's key or value.
This makes more sense for us than declaring a toleration for each individual taint we may use, such as 2i2c.org/community.
Without this fix, node-exporter does not run on nodes tainted with 2i2c.org/community, which leaves us unable to collect statistics such as CPU and memory usage for user pods on those nodes.
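
For illustration, here is a minimal Python sketch of how Kubernetes matches a toleration against a taint, which is why the chart default covers 2i2c.org/community while an explicit list of keys does not. It is a simplified model, not the scheduler's actual code: the `Taint` and `Toleration` classes and the `tolerates` / `tolerates_all` helpers are hypothetical names, and the taint value `"example"` is a placeholder since the real value isn't stated here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches every taint key (used with Exists)
    operator: str = "Equal"  # Kubernetes treats an unset operator as Equal
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration matches this single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True   # value is ignored
    if toleration.operator == "Equal":
        return toleration.value == taint.value
    return False      # unknown operator


def tolerates_all(tolerations: Iterable[Toleration], taints: Iterable[Taint]) -> bool:
    """Return True if every taint is matched by at least one toleration."""
    tolerations = list(tolerations)
    return all(any(tolerates(tol, t) for tol in tolerations) for t in taints)


# The explicit per-key list from the diff above (slash-style keys only, for brevity).
EXPLICIT_LIST = [
    Toleration(key="hub.jupyter.org/dedicated", value="user", effect="NoSchedule"),
    Toleration(key="k8s.dask.org/dedicated", value="worker", effect="NoSchedule"),
]

# The chart's default: any NoSchedule taint, whatever its key or value.
CHART_DEFAULT = [Toleration(operator="Exists", effect="NoSchedule")]

# Hypothetical value; only the key and effect come from this PR.
COMMUNITY_TAINT = Taint(key="2i2c.org/community", value="example", effect="NoSchedule")

if __name__ == "__main__":
    print(tolerates_all(EXPLICIT_LIST, [COMMUNITY_TAINT]))  # False
    print(tolerates_all(CHART_DEFAULT, [COMMUNITY_TAINT]))  # True
```

Note that the chart default only covers the "NoSchedule" effect, so in this model a "NoExecute" taint would still keep node-exporter off a node.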