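In an environment that had previously been running normally, users suddenly reported that the Kubernetes cluster was unavailable and that kubectl commands could not be executed.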
Checking the kubelet service status showed the same message repeated in the log: node "master1" not found.
[root@master1 var]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Thu 2022-05-05 10:32:20 CST; 13s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 15016 (kubelet)
    Tasks: 14
   Memory: 38.7M
   CGroup: /system.slice/kubelet.service
           └─15016 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod...

May 05 10:32:32 master1 kubelet[15016]: E0505 10:32:32.353984 15016 kubelet.go:2263] node "master1" not found
May 05 10:32:32 master1 kubelet[15016]: E0505 10:32:32.454152 15016 kubelet.go:2263] node "master1" not found
May 05 10:32:32 master1 kubelet[15016]: E0505 10:32:32.554312 15016 kubelet.go:2263] node "master1" not found
May 05 10:32:32 master1 kubelet[15016]: E0505 10:32:32.654451 15016 kubelet.go:2263] node "master1" not found
May 05 10:32:32 master1 kubelet[15016]: E0505 10:32:32.754553 15016 kubelet.go:2263] node "master1" not found
The master1 node is the local machine and its network was working normally. Restarting the kubelet service did not help.
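While typing a command, I happened to press Tab for completion, and bash printed an error right after the partially typed command: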
[root@master1 log]# cat yum-bash: cannot create temp file for here-document: No space left on device
This pointed to a lack of disk space. Checking disk usage showed that the root filesystem was 100% full.
[root@master1 var]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/centos_202-root   39G   39G   20K 100% /
devtmpfs                     3.9G     0  3.9G   0% /dev
tmpfs                        3.9G  4.0K  3.9G   1% /dev/shm
tmpfs                        3.9G  385M  3.5G  10% /run
tmpfs                        3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/loop0                    11G   11G     0 100% /opt/centos
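The check above can also be scripted. The following is a minimal sketch, not part of the original troubleshooting session: the function name full_filesystems and the default threshold of 90 are assumptions chosen for illustration. It reads df-style output on standard input and prints the mount point and usage of every filesystem at or above the threshold.

```shell
# Print "mountpoint usage" for every filesystem at or above a usage threshold.
# Reads df-style output on stdin (header line, then one filesystem per line,
# with Use% in column 5 and the mount point in column 6).
# Hypothetical helper; the name and the default threshold are assumptions.
full_filesystems() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        NR > 1 {
            use = $5
            sub(/%/, "", use)          # "100%" -> "100"
            if (use + 0 >= t) print $6, $5
        }'
}
```

It could be used as: df -P | full_filesystems 90. The -P option keeps each filesystem on a single line, which the column-based parsing relies on. Mount points that contain spaces would not be handled correctly by this sketch.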
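The space was traced to the Docker container directory: one container had consumed a large amount of disk because it had been writing error logs for a long time.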
[root@master1 0d7b200dd9209005759e20e3286a21b1435ff05c2179847a5f294deeb9bba52f]# ls -hlst
total 11G
11G -rw-r----- 1 root root 11G May 3 06:51 0d7b200dd9209005759e20e3286a21b1435ff05c2179847a5f294deeb9bba52f-json.log
8.0K -rw------- 1 root root 5.8K Apr 24 09:29 config.v2.json
4.0K -rw-r--r-- 1 root root 2.0K Apr 24 09:29 hostconfig.json
0 drwx------ 2 root root 6 Apr 24 09:29 mounts
0 drwx------ 2 root root 6 Apr 24 09:29 checkpoints
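One way to locate such a log file is sketched below. This is an illustration rather than the exact procedure used here. It assumes the json-file log driver and the default data directory /var/lib/docker/containers, neither of which is stated in the original; the function name largest_container_logs is hypothetical.

```shell
# List the largest container log files, biggest first (sizes in KiB).
# Assumptions: logs are stored as <dir>/<container-id>/<container-id>-json.log,
# and <dir> defaults to /var/lib/docker/containers.
largest_container_logs() {
    dir="${1:-/var/lib/docker/containers}"
    count="${2:-5}"
    # du -k reports each file's disk usage in KiB; sort numerically, descending.
    find "$dir" -type f -name '*-json.log' -exec du -k {} + 2>/dev/null \
        | sort -rn | head -n "$count"
}
```

Running largest_container_logs with no arguments as root would list the five largest logs under the assumed default directory.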
Empty that container's console log file:
cat /dev/null > 0d7b200dd9209005759e20e3286a21b1435ff05c2179847a5f294deeb9bba52f-json.log
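The file is emptied in place rather than deleted. A likely reason, not stated in the original, is that a process which still has the file open would keep the disk blocks allocated after a delete, so the space would not be freed until that process closed the file. A slightly more defensive sketch of the same step follows; the function name empty_log is hypothetical.

```shell
# Truncate a log file to zero length without removing it.
# Hypothetical helper; refuses anything that is not a regular file.
empty_log() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "not a regular file: $log_file" >&2
        return 1
    fi
    # ":" is the shell no-op; the redirection alone truncates the file.
    : > "$log_file"
}
```

It has the same effect as the cat /dev/null redirection above, with a guard against a mistyped path creating a new empty file.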
After restarting the docker and kubelet services, the Kubernetes cluster recovered.
systemctl restart docker
systemctl restart kubelet