Kubernetes: "can't join IPC of container ... non-shareable IPC" error when running a workload

Background

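Yesterday, after setting up a Kubernetes cluster, I tried running the official example to get familiar with basic k8s operations. The environment was as follows.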

Software versions

  • OS version
root@k8s-mst:/# uname -a
Linux k8s-mst 4.15.0-70-generic #79-Ubuntu SMP Tue Nov 12 10:36:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Kubernetes version
root@k8s-mst:/# kubectl version
Client Version: version.Info{Major:"1", Minor:"5+", GitVersion:"v1.5.9-beta.0-dirty", GitCommit:"f35802d3a00b37a32476451266af05ce9760fec0", GitTreeState:"dirty", BuildDate:"2019-11-13T06:51:04Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version
root@k8s-nod1:/# docker version
Client: Docker Engine - Community
 Version:           19.03.4
 API version:       1.40
 Go version:        go1.12.10
 Git commit:        9013bf583a
 Built:             Fri Oct 18 15:54:09 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.4
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.10
  Git commit:       9013bf583a
  Built:            Fri Oct 18 15:52:40 2019
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.10
  GitCommit:        b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version:          1.0.0-rc8+dev
  GitCommit:        3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Error message

  1. The deployment was created successfully, but kubectl get pod showed the pods stuck in an error state.
root@k8s-mst:~/web_sample# kubectl apply -f nginx-deployment.yaml
deployment "nginx-deployment" created
root@k8s-mst:~/web_sample# kubectl get pod
NAME                                READY     STATUS              RESTARTS   AGE
nginx-deployment-4087004473-4m0br   0/1       RunContainerError   0          7s
nginx-deployment-4087004473-80rqd   0/1       RunContainerError   0          7s
nginx-deployment-4087004473-qfdmw   0/1       RunContainerError   0          7s
  2. kubectl describe pod shows the detailed error:
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath           Type            Reason          Message
  ---------     --------        -----   ----                    -------------           --------        ------          -------
  34s           34s             1       {default-scheduler }                            Normal          Scheduled       Successfully assigned nginx-deployment-4087004473-qfdmw to k8s-nod2
  31s           31s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Normal          Created         Created container with docker id 7fc4f63e0bd6; Security:[seccomp=unconfined]
  30s           30s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Warning         Failed          Failed to start container with docker id 7fc4f63e0bd6 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
  28s           28s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Normal          Created         Created container with docker id 3c61a9aaa6a4; Security:[seccomp=unconfined]
  28s           28s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Warning         Failed          Failed to start container with docker id 3c61a9aaa6a4 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
  18s           18s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Normal          Created         Created container with docker id 42970a39a9b4; Security:[seccomp=unconfined]
  17s           17s             1       {kubelet k8s-nod2}      spec.containers{nginx}  Warning         Failed          Failed to start container with docker id 42970a39a9b4 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
  32s           3s              4       {kubelet k8s-nod2}      spec.containers{nginx}  Normal          Pulled          Container image "nginx:1.7.9" already present on machine
  30s           2s              4       {kubelet k8s-nod2}                              Warning         FailedSync      Error syncing pod, skipping: failed to "StartContainer" for "nginx" with RunContainerError: "runContainer: Error response from daemon: {\"message\":\"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)\"}"

  2s    2s      1       {kubelet k8s-nod2}      spec.containers{nginx}  Normal  Created Created container with docker id 20ca9e358257; Security:[seccomp=unconfined]
  2s    2s      1       {kubelet k8s-nod2}      spec.containers{nginx}  Warning Failed  Failed to start container with docker id 20ca9e358257 with error: Error response from daemon: {"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"}
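  3. The log shows that the error is caused by the IPC mode, and it even hints at the fix: use shareable mode for the donor container. The donor here is the pod's infrastructure (pause) container, whose IPC namespace the nginx container tries to join.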
"message":"can't join IPC of container 2dfd13510b3f816ed0437ea70a8a57daeaa7a38eec0035972044b25dd102f1cd: non-shareable IPC (hint: use IpcMode:shareable for the donor container)"
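To see which container is the donor and which IPC mode it was created with, the ID can be pulled out of the message and handed to docker inspect. This is a minimal sketch, assuming a POSIX shell; the helper names donor_id_from_error and donor_ipc_mode are made up for illustration and are not part of docker or kubectl.

```shell
# Hypothetical helper: read text on stdin and print the first 64-hex-digit
# container ID that follows "can't join IPC of container".
donor_id_from_error() {
  sed -n "s/.*can't join IPC of container \([0-9a-f]\{64\}\).*/\1/p" | head -n 1
}

# Hypothetical helper: print the IPC mode recorded for a container.
# Must run on the node whose Docker daemon owns the container (k8s-nod2 here).
donor_ipc_mode() {
  docker inspect --format '{{.HostConfig.IpcMode}}' "$1"
}

# Example:
#   on the master:  kubectl describe pod <pod-name> | donor_id_from_error
#   on the node:    donor_ipc_mode <id printed above>
```

Any value other than shareable for the donor would be consistent with the hint in the error message.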

Solution

  1. I could not find a ready-made fix online, so I had to work it out myself.

  2. First, searching for docker ipc turned up some clues in the docker run reference, which tentatively confirmed that the cause is the IPC mode now defaulting to private.

  3. Searching the docker docs for docker default ipc, I found the cause in the Docker 19.03 release notes. So the version was to blame all along (sigh).

daemon: Now use ‘private’ ipc mode by default. moby/moby#35621
  4. Following the link to the pull request on GitHub, I found the fix:
Old (bad, but backward-compatible) behavior (i.e. "shareable" containers by default) can be enabled by either using --default-ipc-mode shareable daemon command line option, or by adding a "default-ipc-mode": "shareable" line in docker.json configuration file.
  5. Following that hint, edit /etc/docker/daemon.json (create it if it does not exist) and add the following:
{
  "default-ipc-mode": "shareable"
}
  6. Restart Docker
systemctl restart docker
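If daemon.json already exists with other settings, pasting the snippet above over it would drop them. The following is a rough sketch of a more careful edit, not a definitive script: it writes the file when it is missing or empty and otherwise merges the key with jq. It assumes jq is installed for the merge case, and ensure_default_ipc_mode is a hypothetical helper name.

```shell
# Hypothetical helper: make sure FILE sets "default-ipc-mode": "shareable".
ensure_default_ipc_mode() {
  file="$1"
  if [ ! -s "$file" ]; then
    # Missing or empty file: write the minimal configuration.
    printf '{\n  "default-ipc-mode": "shareable"\n}\n' > "$file"
  elif command -v jq >/dev/null 2>&1; then
    # Existing file: merge the key and keep every other setting (assumes jq).
    tmp="$(mktemp)" || return 1
    if jq '. + {"default-ipc-mode": "shareable"}' "$file" > "$tmp"; then
      cat "$tmp" > "$file"
      rm -f "$tmp"
    else
      rm -f "$tmp"
      return 1
    fi
  else
    echo "jq not found; edit $file by hand" >&2
    return 1
  fi
}

# Example (as root on each node):
#   ensure_default_ipc_mode /etc/docker/daemon.json && systemctl restart docker
```

Writing through a temporary file means a failed jq run (for example on invalid JSON) leaves the original file untouched.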

All done

  1. Redeploy the deployment
root@k8s-mst:~/web_sample# kubectl apply -f nginx-deployment.yaml
deployment "nginx-deployment" created
root@k8s-mst:~/web_sample# kubectl get pods
NAME                                READY     STATUS    RESTARTS   AGE
nginx-deployment-4087004473-9wjbr   1/1       Running   0          29s
nginx-deployment-4087004473-g9qpr   1/1       Running   0          29s
nginx-deployment-4087004473-z6jkm   1/1       Running   0          29s
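The same behavior can be checked on a node without Kubernetes. The sketch below starts a donor container without an --ipc flag, so it receives the daemon's default IPC mode, and then tries to join its IPC namespace from a second container. It assumes the busybox image is available or can be pulled; check_ipc_join is a hypothetical helper name.

```shell
# Hypothetical check: does a container created with the daemon's default
# IPC mode allow another container to join its IPC namespace?
check_ipc_join() {
  donor="ipc-donor-$$"
  # No --ipc flag, so the daemon default applies to the donor.
  docker run -d --name "$donor" busybox sleep 60 >/dev/null || return 1
  # Print the mode the daemon assigned (shareable is expected after the fix).
  docker inspect --format '{{.HostConfig.IpcMode}}' "$donor"
  # Joining is rejected with "non-shareable IPC" when the donor is private.
  if docker run --rm --ipc="container:$donor" busybox true; then
    status=0
  else
    status=1
  fi
  docker rm -f "$donor" >/dev/null
  return "$status"
}

# Example (on a worker node):
#   check_ipc_join && echo "IPC join works"
```

On a node still using the private default, the second docker run would be expected to fail with the same message the kubelet reported.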