Enhancement: Check cloud-init status #955
/triage accepted
CAPI uses a sentinel file to check whether bootstrapping succeeded. Some providers, like CAPZ, do check this file as an indication of successful bootstrapping.
NOTE to everybody: Feel free to work on this and contact/ping @lentzi90, or ask here if you have any questions!
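As a rough sketch of what checking that sentinel could look like on a node (assuming the conventional CAPI sentinel path `/run/cluster-api/bootstrap-success.complete`; the exact path is defined by the bootstrap provider contract and may differ per provider):

```shell
#!/bin/sh
# Sketch: check for the Cluster API bootstrap sentinel file on a node.
# The default path below is an assumption based on the CAPI bootstrap
# contract; adjust it if your bootstrap provider writes it elsewhere.
check_sentinel() {
  sentinel="${1:-/run/cluster-api/bootstrap-success.complete}"
  if [ -f "$sentinel" ]; then
    echo "bootstrap succeeded"
  else
    echo "bootstrap sentinel missing" >&2
    return 1
  fi
}
```

In CI this would typically be run over SSH against each provisioned Machine, failing the job when the sentinel is absent.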
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /lifecycle stale
Stale issues close after 30d of inactivity. Reopen the issue with /reopen
@metal3-io-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@furkatgofurov7: Reopened this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/remove-help
/help
@furkatgofurov7: Please ensure the request meets the requirements listed here. If this request no longer meets these requirements, the label can be removed. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /lifecycle stale
/lifecycle frozen
This would be a nice feature in the kubeadm-bootstrap-operator.
We recently had a problem in the CI where the OS we use for the Nodes had an update that made one of our `preKubeadmCommands` fail. This in turn caused some network issues, which was what we detected first. It took quite some time before we managed to figure out the root cause, because no one suspected an error in the cloud-init commands: the Machines were provisioned, the cluster worked as expected in many ways, and all Nodes were healthy.

My suggestion is that we add a step to the integration tests (or even to the controller, if possible) to detect errors in cloud-init and report them in a more obvious way. In the CI we should simply be able to check `cloud-init status` and error out if it reports `status: error`.

To be clear, this check should be done on the workload cluster Nodes.
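A minimal sketch of such a CI step might look like the following. The parsing is an assumption based on cloud-init's human-readable output lines (`status: done`, `status: error`); a real check would collect that output over SSH from each workload cluster Node and feed it in:

```shell
#!/bin/sh
# Sketch: fail a CI step when cloud-init reports an error on a node.
# Takes the output of `cloud-init status` as an argument so it can be
# applied to text collected over SSH from each workload cluster Node.
check_cloud_init_status() {
  case "$1" in
    *"status: error"*)
      echo "cloud-init failed on this node" >&2
      return 1 ;;
    *"status: done"*)
      echo "cloud-init completed successfully"
      return 0 ;;
    *)
      # still running / not started: inconclusive, not a hard failure
      echo "cloud-init status inconclusive: $1"
      return 0 ;;
  esac
}
```

A CI loop could then iterate over the workload Nodes, run `cloud-init status` on each, and fail the job as soon as one node trips this check.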