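While working through an introductory example from 《Kubernetes权威指南》 (The Definitive Guide to Kubernetes), I found that the pod stayed stuck in the ContainerCreating state. Running kubectl describe pod mysql showed the following errors: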
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1h 24m 17 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ErrImagePull: "image pull failed for registry.access.redhat.com/rhel7/pod-infrastructure:latest, this may be because there are no credentials on this request. details: (open /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt: no such file or directory)"
1h 19m 291 {kubelet 127.0.0.1} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "POD" with ImagePullBackOff: "Back-off pulling image \"registry.access.redhat.com/rhel7/pod-infrastructure:latest\""
15m 15m 1 {kubelet 127.0.0.1} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
15m 15m 1 {kubelet 127.0.0.1} spec.containers{mysql} Normal Pulling pulling image "mysql"
7m 7m 1 {kubelet 127.0.0.1} Warning MissingClusterDNS kubelet does not have ClusterDNS IP configured and cannot create Pod using "ClusterFirst" policy. Falling back to DNSDefault policy.
7m 7m 1 {kubelet 127.0.0.1} spec.containers{mysql} Normal Pulling pulling image "mysql"
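The problem is fairly obvious: the file /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt is missing. Checking with ls -l shows that it is a symlink pointing to /etc/rhsm/ca/redhat-uep.pem, but that target file does not exist. Searching with yum search *rhsm* leads to the following steps: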
- Install the python-rhsm-certificates package:
# yum install python-rhsm-certificates -y
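This runs into another problem:
python-rhsm-certificates <= 1.20.3-1 被 (已安裝) subscription-manager-rhsm-certificates-1.20.11-1.el7.centos.x86_64 取代
That is, yum reports that python-rhsm-certificates is obsoleted by the already-installed subscription-manager-rhsm-certificates package. The workaround is to remove that package with yum remove subscription-manager-rhsm-certificates -y and then download the python-rhsm-certificates package directly: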
# wget http://mirror.centos.org/centos/7/os/x86_64/Packages/python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm
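Then install the downloaded rpm manually (rpm -ivh needs the file name of the package that was just fetched):
# rpm -ivh python-rhsm-certificates-1.19.10-1.el7_4.x86_64.rpm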
After this, the file /etc/rhsm/ca/redhat-uep.pem exists.
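The symlink check done by hand with ls -l can be sketched as a small shell helper. This is only an illustration: link_status is a hypothetical name, not something from the original post, and it simply classifies a path as ok, missing, or dangling.

```shell
#!/bin/sh
# link_status is a hypothetical helper (not from the original post).
# It prints "ok", "missing", or "dangling -> <target>" for the given path.
link_status() {
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        # -L tests the link itself; -e follows the link and fails
        # when the target file does not exist.
        echo "dangling -> $(readlink "$f")"
    elif [ -e "$f" ]; then
        echo "ok"
    else
        echo "missing"
    fi
}

# Example call for the certificate discussed above:
#   link_status /etc/docker/certs.d/registry.access.redhat.com/redhat-ca.crt
```

Before the certificate package is installed, this call would be expected to report a dangling link to /etc/rhsm/ca/redhat-uep.pem; afterwards it should report ok.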
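- Pull the image with docker pull registry.access.redhat.com/rhel7/pod-infrastructure:latest. This may be very slow. To speed it up, register an account at https://dashboard.daocloud.io, click the accelerator (加速器) entry, copy the command it provides and run it, then restart docker so the mirror takes effect. If the docker service fails to start after the restart, inspect it with systemctl status docker: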
# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since 一 2018-05-28 22:13:37 CST; 13s ago
Docs: http://docs.docker.com
Process: 79849 ExecStart=/usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt native.cgroupdriver=systemd --userland-proxy-path=/usr/libexec/docker/docker-proxy-current --init-path=/usr/libexec/docker/docker-init-current --seccomp-profile=/etc/docker/seccomp.json $OPTIONS $DOCKER_STORAGE_OPTIONS $DOCKER_NETWORK_OPTIONS $ADD_REGISTRY $BLOCK_REGISTRY $INSECURE_REGISTRY $REGISTRIES (code=exited, status=1/FAILURE)
Main PID: 79849 (code=exited, status=1/FAILURE)
5月 28 22:13:37 kube.example.com systemd[1]: Starting Docker Application Container Engine...
5月 28 22:13:37 kube.example.com dockerd-current[79849]: unable to configure the Docker daemon with file /etc/docker/daemon.json: invalid character '}' loo...y string
5月 28 22:13:37 kube.example.com systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
5月 28 22:13:37 kube.example.com systemd[1]: Failed to start Docker Application Container Engine.
5月 28 22:13:37 kube.example.com systemd[1]: Unit docker.service entered failed state.
5月 28 22:13:37 kube.example.com systemd[1]: docker.service failed.
Hint: Some lines were ellipsized, use -l to show in full
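At this point I deleted /etc/docker/seccomp.json and restarted docker again, which brought the service back. Note, however, that the log above names /etc/docker/daemon.json and reports an invalid character '}', which is a JSON syntax error in that file, so it is worth checking daemon.json for a stray character (such as a trailing comma) before deleting anything.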
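Because the log blames /etc/docker/daemon.json, a quick syntax check of that file before restarting docker can save a failed restart. The following is a minimal sketch, assuming a python3 or python interpreter is on PATH; json_ok is a hypothetical helper name, and the check relies on Python's standard json.tool module.

```shell
#!/bin/sh
# json_ok is a hypothetical helper (not from the original post).
# It returns 0 only if the given file parses as valid JSON.
# Assumption: python3 or python is installed; returns 2 if neither is found.
json_ok() {
    py=$(command -v python3 || command -v python) || return 2
    # json.tool exits non-zero when the input is not valid JSON,
    # for example when a trailing comma precedes the closing brace.
    "$py" -m json.tool "$1" >/dev/null 2>&1
}

# Example: restart docker only if the config parses.
#   json_ok /etc/docker/daemon.json && systemctl restart docker
```

If the check fails, opening the file and removing the offending character is usually enough; the exact content written by the accelerator script is not shown in the post, so it is not reproduced here.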
- Finally, delete the rc, svc, and pod created earlier and recreate them. After a short while the pod starts successfully.
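Probable cause: according to the error message, starting the pod requires the registry.access.redhat.com/rhel7/pod-infrastructure:latest image, which has to be pulled from the Red Hat registry. The pull failed because the certificate was missing; once the certificate is installed, the pull succeeds.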
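Since the whole diagnosis starts from the ErrImagePull event, it can help to extract the failing image names from the kubectl describe output. The filter below is a rough sketch: failed_images is a hypothetical name, and the pattern assumes the "image pull failed for <image>," wording shown in the events above, which may differ in other Kubernetes versions.

```shell
#!/bin/sh
# failed_images is a hypothetical helper (not from the original post).
# It reads event text on stdin and prints each image named in an
# "image pull failed for <image>," message, one per line, de-duplicated.
failed_images() {
    sed -n 's/.*image pull failed for \([^, ]*\),.*/\1/p' | sort -u
}

# Example:
#   kubectl describe pod mysql | failed_images
```

Each image printed this way can then be tried manually with docker pull to confirm whether the registry certificate is the problem.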