Enhancement: Check cloud-init status #955

lentzi90 · 2022-02-24T09:42:49Z

We recently had a problem in the CI where the OS we use for the Nodes had an update that made one of our preKubeadmCommands fail. This in turn caused some network issues, which were what we detected first. It took quite some time to figure out the root cause because no one suspected an error in the cloud-init commands. The Machines were provisioned, the cluster worked as expected in many ways, and all Nodes were healthy.

My suggestion is that we add a step to the integration tests (or even to the controller if possible) to detect errors in cloud-init and report them in a more obvious way. In the CI we should simply be able to check cloud-init status and error out if it is set to status: error.
To be clear, this check should be done on the workload cluster Nodes.
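A minimal sketch of what such a CI check could look like, assuming the test already has some way to run `cloud-init status` on each workload cluster Node and capture its text output (how that output is collected is left out here). The function names `parse_cloud_init_status` and `check_node` are hypothetical and only for illustration.

```python
import re


def parse_cloud_init_status(output: str) -> str:
    """Return the value of the 'status:' line from `cloud-init status` output."""
    match = re.search(r"^status:\s*(\S+)", output, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'status:' line found in cloud-init output")
    return match.group(1)


def check_node(node_name: str, output: str) -> None:
    """Fail loudly if cloud-init reported an error on the given Node.

    `node_name` and `output` are assumed to be supplied by the test harness.
    """
    status = parse_cloud_init_status(output)
    if status == "error":
        raise RuntimeError(f"cloud-init failed on node {node_name}: status: error")
```

Calling `check_node` once per Node would make a failed preKubeadmCommand surface as a test failure instead of as an unrelated-looking network problem.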

Rozzii · 2022-03-02T14:23:24Z

/triage accepted
/kind feature
/help

Arvinderpal · 2022-03-02T14:38:44Z

CAPI uses a sentinel file to check if bootstrapping succeeded.
kubernetes-sigs/cluster-api#3716

Some providers, like capz, do check this file as an indication of successful bootstrapping.
For baremetal, the tricky part I think is accessing this file from the management cluster.
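As a rough illustration of the sentinel-file check, here is a sketch that would have to run on the Node itself, so it does not solve the problem of reaching the file from the management cluster. The default path is an assumption based on the CAPI bootstrap contract as I understand it; the helper name `bootstrap_succeeded` and the `root` parameter are hypothetical.

```python
from pathlib import Path

# Assumed sentinel location; verify against the CAPI version in use.
SENTINEL = "/run/cluster-api/bootstrap-success.complete"


def bootstrap_succeeded(root: str = "/", sentinel: str = SENTINEL) -> bool:
    """Return True if the bootstrap sentinel file exists under `root`.

    `root` lets the check run against a mounted or copied Node filesystem.
    """
    return (Path(root) / sentinel.lstrip("/")).is_file()
```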

Rozzii · 2022-03-16T09:28:32Z

NOTE to everybody: Feel free to work on this, and ping @lentzi90 or ask here if you have any questions!

metal3-io-bot · 2022-06-14T10:27:06Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

metal3-io-bot · 2022-07-14T10:28:40Z

Stale issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle stale.

/close

metal3-io-bot · 2022-07-14T10:28:43Z

@metal3-io-bot: Closing this issue.

In response to this:

Stale issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle stale.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

furkatgofurov7 · 2022-07-14T10:33:22Z

/reopen
/remove-lifecycle stale

metal3-io-bot · 2022-07-14T10:33:24Z

@furkatgofurov7: Reopened this issue.

In response to this:

/reopen
/remove-lifecycle stale

furkatgofurov7 · 2022-10-05T14:37:38Z

/remove-help

furkatgofurov7 · 2022-10-05T14:50:29Z

/help

metal3-io-bot · 2022-10-05T15:01:53Z

@furkatgofurov7:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

metal3-io-bot · 2023-01-03T15:31:19Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues will close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Rozzii · 2023-02-01T14:29:29Z

/lifecycle frozen
Would be a good improvement in the future.

Rozzii · 2023-03-29T14:34:35Z

This would be a nice feature in the kubeadm-bootstrap-operator.

metal3-io-bot added the needs-triage label Feb 24, 2022

metal3-io-bot added the lifecycle/stale label Jun 14, 2022

metal3-io-bot closed this as completed Jul 14, 2022

metal3-io-bot reopened this Jul 14, 2022

metal3-io-bot removed the lifecycle/stale label Jul 14, 2022

metal3-io-bot removed the help wanted label Oct 5, 2022

metal3-io-bot added the help wanted label Oct 5, 2022

metal3-io-bot added the lifecycle/stale label Jan 3, 2023

metal3-io-bot added lifecycle/frozen and removed lifecycle/stale labels Feb 1, 2023

Rozzii added this to Metal3 - Roadmap Jun 28, 2024

Rozzii moved this to Backlog in Metal3 - Roadmap Jun 28, 2024
