K8s control plane high-availability mode #940
I think we need to introduce versioning of vHive in the Invitro repo so that we won't need this note, because it is not only about this PR but also about the ongoing Knative and K8s upgrades in vHive.
Pretty good. I left some questions below about the changes; most are just to confirm that it works as expected (I didn't get deep into the HA setting documentation). The most important part is to document these settings in the quickstart or to place them in a separate .md file linked from the quickstart. Also, add a line about it to the changelog; it seems important enough.
@@ -71,43 +73,22 @@ func CreateMultinodeCluster(stockContainerd string) error {
	return nil
}

// Create kubelet service on master node
func CreateMasterKubeletService() error {
Don't we need it? Does it automatically attach to unix:///run/containerd/containerd.sock? That's the most important part here. But also, we are losing the timeout and verbosity parameters. Do we need them?
Because kubeadm starts the kubelet process automatically, there is no need to start kubelet manually. Regarding the containerd socket, it is now provided as an argument to the kubeadm init command.
@leokondrashov: What is the socket path for Firecracker? Should we make this configurable?
This function doesn't create the kubelet, but rather sets it up with parameters. The socket on the master node doesn't need to be configurable since we use Firecracker only for user containers. But the verbosity and runtime-request-timeout set inside might be used for something.
Hmm. You are right about not starting the kubelet process.
For the runtime-request-timeout parameter, it seems to me we can safely remove it (see DOCS). This should mean that each request from the control plane to the kubelet has to complete within 2 minutes (the default) rather than 15 minutes.
The container-runtime flag has been deprecated since v1.24, according to this.
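If the verbosity and request-timeout settings turn out to be needed after all, one option is to keep them without starting kubelet by hand, by dropping them into KUBELET_EXTRA_ARGS, which kubeadm's systemd drop-in picks up when it starts the kubelet. A minimal sketch in the style of these scripts; utils.ExecShellCmd is the helper used in the surrounding diff, and the concrete values are assumptions, not what this PR does:

// Sketch only (not part of this PR): preserve the old kubelet flags without
// starting the kubelet manually. kubeadm's systemd drop-in sources /etc/default/kubelet,
// so flags placed in KUBELET_EXTRA_ARGS are applied when kubeadm starts the kubelet.
func setKubeletExtraArgs(verbosity int, requestTimeout string) error {
	_, err := utils.ExecShellCmd(
		"echo 'KUBELET_EXTRA_ARGS=\"--v=%d --runtime-request-timeout=%s\"' | sudo tee /etc/default/kubelet",
		verbosity, requestTimeout) // e.g. 2 and "15m"; both values are placeholders
	return err
}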
@@ -1,87 +1,116 @@
# vHive Setup Scripts

- [vHive Setup Scripts](#vhive-setup-scripts)
We can remove this table of contents completely, since GitHub generates one automatically.
For fault tolerance purposes, Kubernetes control plane components can be replicated. This can be done by combining
It would still be better to have a description of what the options are and how they are used (specifically REGULAR, which appears throughout the docs).
Signed-off-by: Lazar Cvetković <[email protected]>
@cvetkovic Hi, what's the status of this PR? My point about the documentation still stands. We should have instructions on how to use the tools in this repo, not just references to general K8s help and a way to use it from Invitro. For example, what would be the steps to set up the HA control plane without all the other things we have in Invitro?
// Original Bash Scripts: scripts/cluster/create_multinode_cluster.sh

if err := CreateMasterKubeletService(); err != nil {
This command fixes #967 for the first master node. Without it, we still use the control IP on the master node when running on CloudLab, which can lead to suspension of the account if the traffic is big enough.
All other master nodes use CreateWorkerKubeletService, which has the fix for this problem but a different issue of its own: in the Firecracker setup, we use our CRI socket, which we don't need on controller nodes.
Please consider running CreateMasterKubeletService for all master nodes.
UPD: running it for each node is up to the Invitro setup scripts, not vHive's, but we need to make it possible from here. And there is not much difference between the master and worker commands, so I think we can combine them into a single command, setup_kubelet, with a sandbox type as an argument (see the sketch below). When setting up a node, we would always choose containerd for all master and backup nodes and change it only for regular nodes.
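A rough sketch of what that combined helper could look like; the function name, the flag set, and the Firecracker socket path are placeholders rather than anything this PR implements, and utils.ExecShellCmd is the repo helper used in the diffs above:

// Sketch: a single kubelet setup entry point, parameterized by sandbox type.
func SetupKubelet(sandbox string, nodeIP string) error {
	// Master and backup nodes always use containerd; only REGULAR (worker) nodes switch runtimes.
	criSocket := "unix:///run/containerd/containerd.sock"
	if sandbox == "firecracker" {
		criSocket = "unix:///path/to/firecracker-cri.sock" // placeholder, not a confirmed path
	}
	// Advertise the experiment-network IP so kubelet traffic stays off the control network (#967).
	_, err := utils.ExecShellCmd(
		"echo 'KUBELET_EXTRA_ARGS=\"--node-ip=%s --container-runtime-endpoint=%s\"' | sudo tee /etc/default/kubelet",
		nodeIP, criSocket)
	return err
}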
	--apiserver-advertise-address=%s \
	--cri-socket unix:///run/containerd/containerd.sock \

command := fmt.Sprintf(`sudo kubeadm init --v=%d \
We need to add --apiserver-advertise-address here and to the kubeadm join command for control-plane nodes. Right now, all the IPs that the load balancer uses are control-network IPs, which is bad (see the comment about kubelet creation).
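Something along these lines for the control-plane join, as a sketch only: the configs.Kube field names for the discovery hash and certificate key are assumed, and the flag set mirrors the standard kubeadm control-plane join rather than this PR's exact code.

// Sketch: join an additional control-plane node and advertise its 10.0.1.0/24 address.
command := fmt.Sprintf(`sudo kubeadm join %s:%s --token %s \
	--discovery-token-ca-cert-hash %s \
	--control-plane --certificate-key %s \
	--apiserver-advertise-address=%s \
	--cri-socket unix:///run/containerd/containerd.sock`,
	configs.Kube.ApiserverAdvertiseAddress,
	configs.Kube.ApiserverPort,
	configs.Kube.ApiserverToken,
	configs.Kube.ApiserverTokenHash, // assumed field name for the sha256 discovery hash
	configs.Kube.CertificateKey,     // assumed field name; produced by `kubeadm init --upload-certs`
	nodeIP)                          // the joining node's own experiment-network IP
_, err := utils.ExecShellCmd(command)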
shellOut, err := utils.ExecShellCmd("sed -n '/.*kubeadm join.*/p' < %s/masterNodeInfo | sed -n 's/.*join \\(.*\\):\\(\\S*\\) --token \\(\\S*\\).*/\\1 \\2 \\3/p'", configs.System.TmpDir)
- if !utils.CheckErrorWithMsg(err, "Failed to extract master node information from logs!\n") {
+ if !utils.CheckErrorWithMsg(err, "Failed to extract API Server address, port, and token from logs!\n") {
	return err
}
splittedOut := strings.Split(shellOut, " ")
configs.Kube.ApiserverAdvertiseAddress = splittedOut[0]
configs.Kube.ApiserverPort = splittedOut[1]
configs.Kube.ApiserverToken = splittedOut[2]

// API Server discovery token
shellOut, err = utils.ExecShellCmd("sed -n '/.*sha256:.*/p' < %s/masterNodeInfo | sed -n 's/.*\\(sha256:\\S*\\).*/\\1/p'", configs.System.TmpDir)
These commands that parse tmp/masterNodeInfo return duplicate lines of the same info, because the output of kubeadm init contains two kubeadm join lines (one for control-plane nodes and one for workers). Because of that, masterKey.yaml is malformed.
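One way to make the parsing robust, as a sketch: keep only the first matching line (both join commands carry the same address, port, and token), for example by adding head -n 1 to the existing pipeline. The same trick would apply to the sha256 extraction.

// Sketch: kubeadm init prints two "kubeadm join" lines (control plane and workers),
// so keep only the first match to avoid duplicated values leaking into masterKey.yaml.
shellOut, err := utils.ExecShellCmd(
	"sed -n '/.*kubeadm join.*/p' < %s/masterNodeInfo | head -n 1 | sed -n 's/.*join \\(.*\\):\\(\\S*\\) --token \\(\\S*\\).*/\\1 \\2 \\3/p'",
	configs.System.TmpDir)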
Hello, what is the status of this PR? Can we merge it now? @leokondrashov
NOTE: Only merge this PR in conjunction with vhive-serverless/invitro#379.
Modified the setup scripts so that the Kubernetes control plane can be deployed in high-availability mode, i.e., a mode where kube-apiserver, etcd, kube-scheduler, and kube-controller-manager are replicated for fault-tolerance purposes. The load balancer sits in front of the API Server replicas and is reachable through the virtual IP address 10.0.1.254. The setup favors CloudLab and assumes that all nodes of the cluster are part of 10.0.1.0/24.
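For reference, the HA initialization presumably boils down to pointing kubeadm at the load balancer's virtual IP and uploading the control-plane certificates; a sketch in the style of the setup scripts, where the port and the exact flag set are assumptions:

// Sketch: initialize the first control-plane node behind the load balancer VIP.
// 10.0.1.254 is the virtual IP from the description; 6443 is the default API Server port.
command := fmt.Sprintf(`sudo kubeadm init \
	--control-plane-endpoint=10.0.1.254:6443 \
	--upload-certs \
	--apiserver-advertise-address=%s \
	--cri-socket unix:///run/containerd/containerd.sock`, nodeIP)
_, err := utils.ExecShellCmd(command)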
I tested the setup on a 5-node xl170 Cloudlab cluster with 1 and 3 control plane replicas. The cluster deploys without any issues and seems to be working fine.
@ustiugov, @leokondrashov: Can you once more check to make sure everything is fine here? I see a lot of changes were made in the setup scripts that I might not be exactly aware of.
Visually, the setup looks as shown in the following image.