Switching the K8s Runtime from Docker to Containerd in Practice


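Kubernetes 1.24 removed dockershim, so kubelet can no longer use Docker Engine directly, and containerd is the runtime most clusters move to. For a cluster running a version older than 1.24, the first step toward upgrading is therefore to switch the runtime from Docker to Containerd.

Official migration guide: https://kubernetes.io/zh-cn/docs/tasks/administer-cluster/migrating-from-dockershim/change-runtime-containerd/

Here is how I carried out the runtime switch in practice:

First, to minimize the impact on cluster services, switch the nodes one at a time, and move on to the next node only after the current one has been switched successfully: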

0. Mark the node unschedulable

# kubectl cordon k8s1
node/k8s1 cordoned

# kubectl get node 
NAME            STATUS                     ROLES    AGE   VERSION
k8s1            Ready,SchedulingDisabled   <none>   39d   v1.23.17
k8s2            Ready                      <none>   39d   v1.23.17
k8s3            Ready                      <none>   39d   v1.23.17

The node is now marked SchedulingDisabled.
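
To check the cordon state from a script instead of reading the STATUS column, the node's spec.unschedulable field can be queried directly. This is a minimal sketch; the function name node_is_cordoned is hypothetical and not part of the original procedure.

```shell
# Hypothetical helper: succeeds (exit 0) when the node is cordoned.
# spec.unschedulable is "true" on a cordoned node and empty otherwise.
node_is_cordoned() {
  [ "$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}
```

Usage: node_is_cordoned k8s1 && echo "k8s1 is cordoned"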

1. Evict the Pods running on the node

# kubectl drain k8s1 --ignore-daemonsets --delete-emptydir-data --force

Wait for the pods on node k8s1 to be evicted.

Once the command finishes, check which pods have not been evicted:

# kubectl get po -A -owide | grep k8s1
kb-system              csi-s3-59vwz                                  
kube-system            calico-node-wkdlm                       
kubevirt               virt-handler-75jb9                              
monitoring             node-exporter-9lbtg                       
rook-ceph              csi-cephfsplugin-8777w                  
rook-ceph              csi-rbdplugin-xzwrs                        

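These are the pods that drain left in place (--ignore-daemonsets skips DaemonSet-managed pods). They need to be restarted manually later.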
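
Filtering with grep matches the node name anywhere in the line, so "k8s1" would also match a node called "k8s10". A field selector on spec.nodeName avoids that. The sketch below assumes nothing beyond kubectl itself; the function name pods_on_node is hypothetical.

```shell
# Hypothetical helper: list every pod bound to the given node, in all namespaces.
# The field selector compares spec.nodeName exactly, unlike a substring grep.
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}
```

Usage: pods_on_node k8s1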

2. Stop the kubelet service on the node

systemctl stop kubelet

3. Install Containerd

yum install containerd -y

Configure the kernel modules that Containerd requires:

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

Load the modules:

modprobe -- overlay
modprobe -- br_netfilter
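
It can be worth confirming that both modules are actually loaded before continuing. The following is a sketch that reads /proc/modules; the function name missing_modules is hypothetical, and it assumes the modules are built as loadable modules (a module compiled into the kernel does not appear in /proc/modules).

```shell
# Hypothetical helper: print each required module that is absent from the
# given module list (default: /proc/modules). Exit status is 0 only when
# both overlay and br_netfilter are present.
missing_modules() {
  list="${1:-/proc/modules}"
  missing=0
  for mod in overlay br_netfilter; do
    # Each line of /proc/modules starts with the module name and a space.
    if ! grep -q "^$mod " "$list"; then
      echo "$mod"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: missing_modules || echo "load the modules listed above first"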

Configure the kernel parameters:

cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

Apply the kernel parameters:

sysctl --system
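
To confirm the three parameters took effect, the output of sysctl can be piped through a small filter. This is only a sketch; the function name sysctl_not_enabled is hypothetical, and it assumes the "key = value" format that sysctl prints.

```shell
# Hypothetical helper: read "key = value" lines on stdin and print every
# key whose value is not 1. Exit status is 0 only when all values are 1.
sysctl_not_enabled() {
  awk -F'=' '
    /=/ {
      key = $1; val = $2
      gsub(/[ \t]/, "", key); gsub(/[ \t]/, "", val)
      if (val != "1") { print key; bad = 1 }
    }
    END { exit bad }
  '
}
```

Usage: sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward | sysctl_not_enabled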

Generate the Containerd configuration file:

mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml

Switch Containerd's cgroup driver to systemd:

vim /etc/containerd/config.toml

Find containerd.runtimes.runc.options and add SystemdCgroup = true (if the key already exists, just change its value).
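
The same change can be made without opening an editor. The sketch below assumes the file already contains the line SystemdCgroup = false, which is what I would expect from the containerd config default output for containerd 1.6; if the line is absent, the command changes nothing and the key has to be added by hand. The function name enable_systemd_cgroup is hypothetical.

```shell
# Hypothetical helper: flip SystemdCgroup from false to true in a config file.
# A copy of the previous file is kept next to it with a .bak suffix.
enable_systemd_cgroup() {
  sed -i.bak 's/SystemdCgroup = false/SystemdCgroup = true/' "$1"
}
```

Usage: enable_systemd_cgroup /etc/containerd/config.toml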

Point sandbox_image at a pause image address that matches your version:

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6 k8s1:5000/pause:3.6
docker push k8s1:5000/pause:3.6
vim /etc/containerd/config.toml

Find sandbox_image and change it to: sandbox_image = "k8s1:5000/pause:3.6"
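
As a non-interactive alternative to vim, the sandbox_image line can be rewritten with sed. This is a sketch that assumes the key already exists on a line of its own, as it does in a generated default config; the function name set_sandbox_image is hypothetical.

```shell
# Hypothetical helper: set sandbox_image in a containerd config file.
# Arguments: <config file> <image reference>
# The original indentation is preserved; a .bak copy of the file is kept.
set_sandbox_image() {
  cfg="$1"
  image="$2"
  sed -i.bak "s|^\([[:space:]]*sandbox_image = \).*|\1\"$image\"|" "$cfg"
}
```

Usage: set_sandbox_image /etc/containerd/config.toml k8s1:5000/pause:3.6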

Start Containerd:

systemctl daemon-reload
systemctl enable --now containerd
systemctl status containerd

4. Add a non-SSL private registry

Edit config.toml:

vim /etc/containerd/config.toml

In the registry section, set the configuration path to /etc/containerd/certs.d:

[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

Using k8s1:5000 as an example:

mkdir -p /etc/containerd/certs.d/k8s1:5000/
vim /etc/containerd/certs.d/k8s1:5000/hosts.toml

server = "http://k8s1:5000"

[host."http://k8s1:5000"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
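
When several plain-HTTP registries or several nodes are involved, the directory and hosts.toml above can be generated instead of typed. The sketch below writes the same content as shown for k8s1:5000; the function name write_http_registry_hosts and its argument order are hypothetical.

```shell
# Hypothetical helper: create <certs.d root>/<registry>/hosts.toml for a
# registry served over plain HTTP.
# Arguments: <certs.d root> <registry host:port>
write_http_registry_hosts() {
  root="$1"
  registry="$2"
  mkdir -p "$root/$registry"
  cat > "$root/$registry/hosts.toml" <<EOF
server = "http://$registry"

[host."http://$registry"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF
}
```

Usage: write_http_registry_hosts /etc/containerd/certs.d k8s1:5000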

Restart Containerd:

systemctl restart containerd
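
Before touching kubelet, it may help to confirm that Containerd can pull from the private registry through its CRI plugin, which is the path kubelet will use. This sketch assumes crictl (from cri-tools) is installed on the node, which the steps above do not cover; the function name cri_pull is hypothetical.

```shell
# Hypothetical helper: pull an image through containerd's CRI endpoint.
# Assumes crictl is installed; the socket path matches the one passed to
# kubelet via --container-runtime-endpoint.
cri_pull() {
  crictl --runtime-endpoint unix:///run/containerd/containerd.sock pull "$1"
}
```

Usage: cri_pull k8s1:5000/pause:3.6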

5. Switch the kubelet service to use Containerd as its runtime

Back up the original configuration:

cp /etc/systemd/system/kubelet.service.d/10-kubelet.conf /etc/systemd/system/kubelet.service.d/10-kubelet.conf.docker

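Edit the kubelet configuration so that it uses containerd as the runtime (note: my cluster runs k8s 1.23.17; the dockershim-related flags --network-plugin, --cni-conf-dir, and --cni-bin-dir were removed from kubelet in 1.24, so drop them before upgrading to that version):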

vim /etc/systemd/system/kubelet.service.d/10-kubelet.conf

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig"
Environment="KUBELET_SYSTEM_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin --container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock --cgroup-driver=systemd"
Environment="KUBELET_CONFIG_ARGS=--config=/etc/kubernetes/kubelet-conf.yml"
Environment="KUBELET_EXTRA_ARGS=--node-labels=node.kubernetes.io/node='' "
ExecStart=
ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_SYSTEM_ARGS $KUBELET_EXTRA_ARGS

Start kubelet:

systemctl daemon-reload
systemctl enable --now kubelet
systemctl status kubelet

6. Check that the runtime has been replaced

# kubectl get node -owide 
NAME            STATUS                     ROLES    AGE   VERSION    INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
k8s1            Ready,SchedulingDisabled   <none>   39d   v1.23.17   172.16.58.101   <none>        CentOS Linux 7 (Core)   4.19.12-1.el7.elrepo.x86_64   containerd://1.6.24
k8s2            Ready                      <none>   39d   v1.23.17   172.16.58.102   <none>        CentOS Linux 7 (Core)   4.19.12-1.el7.elrepo.x86_64   docker://20.10.24
k8s3            Ready                      <none>   39d   v1.23.17   172.16.58.103   <none>        CentOS Linux 7 (Core)   4.19.12-1.el7.elrepo.x86_64   docker://20.10.24

The CONTAINER-RUNTIME of node k8s1 is now containerd://1.6.24.
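
The value in the CONTAINER-RUNTIME column comes from the node's status.nodeInfo.containerRuntimeVersion field, so the check can also be scripted. This is a minimal sketch; the function names node_runtime and node_uses_containerd are hypothetical.

```shell
# Hypothetical helper: print the runtime a node reports, e.g. containerd://1.6.24.
node_runtime() {
  kubectl get node "$1" -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}'
}

# Hypothetical helper: succeed only when the node reports a containerd runtime.
node_uses_containerd() {
  case "$(node_runtime "$1")" in
    containerd://*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage: node_uses_containerd k8s1 && kubectl uncordon k8s1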

7. Make the node schedulable again

kubectl uncordon k8s1

8. Manually delete the Pods that were not evicted earlier

Pods that were not evicted and were started under Docker:

kb-system              csi-s3-59vwz                                         
kube-system            calico-node-wkdlm                              
kubevirt               virt-handler-75jb9                                    
monitoring             node-exporter-9lbtg                              
rook-ceph              csi-cephfsplugin-8777w                       
rook-ceph              csi-rbdplugin-xzwrs                             

Delete the Pods so that they are recreated:

kubectl delete po -n kb-system csi-s3-59vwz
kubectl delete po -n kube-system calico-node-wkdlm 
kubectl delete po -n kubevirt  virt-handler-75jb9 
kubectl delete po -n monitoring node-exporter-9lbtg     
kubectl delete po -n rook-ceph csi-cephfsplugin-8777w csi-rbdplugin-xzwrs
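
With more leftover pods, typing one delete command per pod gets tedious. The sketch below reads the same two-column "namespace pod" listing on stdin and deletes each entry; the function name delete_listed_pods is hypothetical.

```shell
# Hypothetical helper: read "<namespace> <pod>" pairs on stdin, one per line,
# and delete each pod. Blank or incomplete lines are skipped.
delete_listed_pods() {
  while read -r ns pod rest; do
    [ -n "$pod" ] || continue
    # Redirect stdin so the command cannot consume the remaining list.
    kubectl delete pod -n "$ns" "$pod" </dev/null
  done
}
```

Usage: save the listing of leftover pods to a file such as leftover.txt (a hypothetical name), then run delete_listed_pods < leftover.txt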

9. Repeat the same steps for the next node
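
To see which nodes are still waiting for the switch, the runtime reported by every node can be listed and filtered. This is a sketch only; the function name nodes_on_docker is hypothetical.

```shell
# Hypothetical helper: print the names of nodes that still report a
# docker:// runtime, one per line.
nodes_on_docker() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}' |
    awk -F'\t' '$2 ~ /^docker:/ { print $1 }'
}
```

Usage: nodes_on_docker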