Galaxy cannot create pods successfully with containerd runtime #137

Open
Huimintai opened this issue Jul 13, 2021 · 2 comments

Comments

@Huimintai
Contributor

TKEStack with the containerd runtime engine fails to create pods, with the following errors:

# kubectl get pods -n kube-system
NAME                                     READY   STATUS              RESTARTS   AGE
coredns-ccc77fb9d-d8vsj                  1/1     Running             0          17h
coredns-ccc77fb9d-l2jn2                  1/1     Running             0          17h
etcd-10.0.32.211                         1/1     Running             0          17h
flannel-rcf9p                            1/1     Running             0          17h
galaxy-daemonset-b88np                   1/1     Running             0          17h
kube-apiserver-10.0.32.211               1/1     Running             0          17h
kube-controller-manager-10.0.32.211      1/1     Running             0          17h
kube-proxy-llkg6                         1/1     Running             0          17h
kube-scheduler-10.0.32.211               1/1     Running             0          17h
metrics-server-v0.3.6-59c66b5dfd-57zz4   2/2     Running             0          17h
metrics-server-v0.3.6-794ccd69c8-6zdrg   0/2     ContainerCreating   0          17h
# kubectl describe pods metrics-server-v0.3.6-794ccd69c8-6zdrg -n kube-system
 Warning  FailedCreatePodSandBox  94s (x4848 over 17h)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2869e68816e95f2eeaa62f26905c049fd4e70240e5a73bf4c30875610d9c8aef": galaxy returns: fail to establish network map[]:failed to open netns "/var/run/netns/cni-93beae62-e333-54c2-12f3-49069a567f4b": failed to Statfs "/var/run/netns/cni-93beae62-e333-54c2-12f3-49069a567f4b": no such file or directory

The galaxy error log:

I0713 02:08:18.298218  184815 server.go:114] ADD metrics-server-v0.3.6-794ccd69c8-6zdrg_kube-system, 16aedda9d4d54b18301610162a4e4d24397a2455fb49a1b2ec6fe270084e9585, /var/run/netns/cni-aee74e86-14a1-fcae-26bd-e96cdf62fa02, [], Jul 13 02:08:18.298209+
I0713 02:08:18.300783  184815 cni.go:93] delegate add 16aedda9d4d54b18301610162a4e4d24397a2455fb49a1b2ec6fe270084e9585 args K8S_POD_NAME=metrics-server-v0.3.6-794ccd69c8-6zdrg;K8S_POD_INFRA_CONTAINER_ID=16aedda9d4d54b18301610162a4e4d24397a2455fb49a1b2ec6fe270084e9585;IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system conf {"delegate":{"type":"galaxy-veth"},"name":"galaxy-flannel","subnetFile":"/run/flannel/subnet.env","type":"galaxy-flannel"}
E0713 02:08:18.335744  184815 cni.go:146] fail to add network map[]: failed to open netns "/var/run/netns/cni-aee74e86-14a1-fcae-26bd-e96cdf62fa02": failed to Statfs "/var/run/netns/cni-aee74e86-14a1-fcae-26bd-e96cdf62fa02": no such file or directory, begin to rollback and delete it
I0713 02:08:18.335898  184815 cni.go:114] delegate del 16aedda9d4d54b18301610162a4e4d24397a2455fb49a1b2ec6fe270084e9585 args K8S_POD_NAME=metrics-server-v0.3.6-794ccd69c8-6zdrg;K8S_POD_INFRA_CONTAINER_ID=16aedda9d4d54b18301610162a4e4d24397a2455fb49a1b2ec6fe270084e9585;IgnoreUnknown=1;K8S_POD_NAMESPACE=kube-system conf {"delegate":{"type":"galaxy-veth"},"name":"galaxy-flannel","subnetFile":"/run/flannel/subnet.env","type":"galaxy-flannel"}
W0713 02:08:18.342705  184815 cni.go:148] fail to delete cni in rollback <nil>
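
The netns path that galaxy fails to open can be pulled out of log lines like these and checked on the node. The following is a minimal sketch, assuming the log text is available on stdin; `extract_netns_paths` and `report_netns_paths` are hypothetical helper names, not part of galaxy or kubectl.

```shell
# Hypothetical helper: print each distinct /var/run/netns/cni-* path found on stdin.
extract_netns_paths() {
  grep -oE '/var/run/netns/cni-[0-9a-f-]+' | sort -u
}

# Hypothetical helper: for each path read from stdin, report whether it
# exists in the filesystem where this function runs.
report_netns_paths() {
  while IFS= read -r path; do
    if [ -e "$path" ]; then
      printf 'present %s\n' "$path"
    else
      printf 'missing %s\n' "$path"
    fi
  done
}
```

For example, assuming the galaxy log is reachable through `kubectl logs`, one could run `kubectl logs -n kube-system galaxy-daemonset-b88np | extract_netns_paths | report_netns_paths` on the node. Sandboxes are retried, so older paths may be reported as missing simply because they were cleaned up. Checking a current path both on the host and from inside the galaxy container may help narrow down where it is not visible, though this is only a suggestion for diagnosis, not a confirmed cause.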

But when I do not install galaxy, metrics-server runs fine:

root@VM-32-165-ubuntu:~# kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
coredns-ccc77fb9d-qgx2g                    1/1     Running   1          9m46s
coredns-ccc77fb9d-wlb82                    1/1     Running   1          9m46s
etcd-vm-32-165-ubuntu                      1/1     Running   2          10m
kube-apiserver-vm-32-165-ubuntu            1/1     Running   2          10m
kube-controller-manager-vm-32-165-ubuntu   1/1     Running   1          10m
kube-proxy-8sqdn                           1/1     Running   1          9m46s
kube-scheduler-vm-32-165-ubuntu            1/1     Running   7          6m25s
metrics-server-v0.3.6-794ccd69c8-wfv7d     2/2     Running   3          9m32s

Also, when I install the community flannel, metrics-server runs fine without any errors:

# kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
coredns-ccc77fb9d-qgx2g                    1/1     Running   1          13m
coredns-ccc77fb9d-wlb82                    1/1     Running   1          13m
etcd-vm-32-165-ubuntu                      1/1     Running   2          14m
kube-apiserver-vm-32-165-ubuntu            1/1     Running   2          14m
kube-controller-manager-vm-32-165-ubuntu   1/1     Running   1          14m
kube-flannel-ds-2grmk                      1/1     Running   0          69s
kube-proxy-8sqdn                           1/1     Running   1          13m
kube-scheduler-vm-32-165-ubuntu            1/1     Running   7          10m
metrics-server-v0.3.6-794ccd69c8-wfv7d     2/2     Running   3          13m
root@VM-32-165-ubuntu:~#
root@VM-32-165-ubuntu:~#
root@VM-32-165-ubuntu:~# kubectl delete pods metrics-server-v0.3.6-794ccd69c8-wfv7d -n kube-system
pod "metrics-server-v0.3.6-794ccd69c8-wfv7d" deleted
root@VM-32-165-ubuntu:~#
root@VM-32-165-ubuntu:~#
root@VM-32-165-ubuntu:~# kubectl get pods -n kube-system
NAME                                       READY   STATUS    RESTARTS   AGE
coredns-ccc77fb9d-qgx2g                    1/1     Running   1          13m
coredns-ccc77fb9d-wlb82                    1/1     Running   1          13m
etcd-vm-32-165-ubuntu                      1/1     Running   2          14m
kube-apiserver-vm-32-165-ubuntu            1/1     Running   2          14m
kube-controller-manager-vm-32-165-ubuntu   1/1     Running   1          14m
kube-flannel-ds-2grmk                      1/1     Running   0          93s
kube-proxy-8sqdn                           1/1     Running   1          13m
kube-scheduler-vm-32-165-ubuntu            1/1     Running   7          10m
metrics-server-v0.3.6-794ccd69c8-grvvq     2/2     Running   0          15s
root@VM-32-165-ubuntu:~# ls /var/run/netns/
cni-62aefc67-2e1a-3287-bb49-123ffc5eb62a  cni-99a91844-ea0c-96ef-79ae-5b43e1b5aa28
cni-99744035-01d0-1a18-2ec9-4a94c68cf683
@Huimintai
Contributor Author

This is with the community flannel CNI:

root@VM-32-165-ubuntu:~# kubectl get pods -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS   AGE     IP             NODE               NOMINATED NODE   READINESS GATES
coredns-ccc77fb9d-65pn9                    1/1     Running   0          4m25s   10.244.0.226   vm-32-165-ubuntu   <none>           <none>
coredns-ccc77fb9d-gvbf9                    1/1     Running   0          4m25s   10.244.0.227   vm-32-165-ubuntu   <none>           <none>
etcd-vm-32-165-ubuntu                      1/1     Running   2          5h13m   10.0.32.165    vm-32-165-ubuntu   <none>           <none>
kube-apiserver-vm-32-165-ubuntu            1/1     Running   2          5h13m   10.0.32.165    vm-32-165-ubuntu   <none>           <none>
kube-controller-manager-vm-32-165-ubuntu   1/1     Running   1          5h13m   10.0.32.165    vm-32-165-ubuntu   <none>           <none>
kube-flannel-ds-2grmk                      1/1     Running   0          5h1m    10.0.32.165    vm-32-165-ubuntu   <none>           <none>
kube-proxy-8sqdn                           1/1     Running   1          5h13m   10.0.32.165    vm-32-165-ubuntu   <none>           <none>
kube-scheduler-vm-32-165-ubuntu            1/1     Running   7          5h9m    10.0.32.165    vm-32-165-ubuntu   <none>           <none>
metrics-server-v0.3.6-794ccd69c8-rwcx9     2/2     Running   0          4m25s   10.244.0.225   vm-32-165-ubuntu   <none>           <none>
root@VM-32-165-ubuntu:~#
root@VM-32-165-ubuntu:~# ls /var/run/netns/
cni-05418fc3-ad75-1cbb-1705-b1fb687e7b74  cni-0e68f071-ba2b-cd80-1a00-19b97cea41fa  cni-b53efb6b-431f-acd8-9ee3-484ee3dda141

@blue-troy
Contributor

Galaxy uses Docker's socket to get a Docker client; you can see the Go module dependency github.com/docker/engine-api v0.4.0. A better way to support different container runtimes is to use the CRI interface.
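
To see which runtime socket is actually present on a node, a small check like the following can help. This is a minimal sketch: `first_socket` is a hypothetical helper, and the socket paths in the usage note are common defaults assumed here, not values taken from galaxy's configuration.

```shell
# Hypothetical helper: print the first argument that is a Unix socket.
# Returns non-zero if none of the given paths is a socket.
first_socket() {
  for p in "$@"; do
    if [ -S "$p" ]; then
      printf '%s\n' "$p"
      return 0
    fi
  done
  return 1
}
```

Usage, with assumed default paths: `first_socket /var/run/docker.sock /run/containerd/containerd.sock || echo "no known runtime socket found"`. On a containerd-only node the Docker socket would typically be absent, which is consistent with a Docker-client-based lookup not working there.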
