Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0" while installing Velero in GKE Cluster


Problem description

I'm trying to install and configure Velero for Kubernetes backups. I followed the link to configure it in my GKE cluster. The installation completed without errors, but Velero is not working.

I'm running all my commands from Google Cloud Shell, where I have installed and configured the Velero client.

On inspecting the Velero deployment and its pods, I found that the image cannot be pulled from the Docker registry.

kubectl get pods -n velero
NAME                      READY   STATUS              RESTARTS   AGE
velero-5489b955f6-kqb7z   0/1     Init:ErrImagePull   0          20s

Error from the velero pod (kubectl describe pod); output trimmed for readability, only the relevant part is shown below:

Events:
  Type     Reason     Age               From                                                  Message
  ----     ------     ----              ----                                                  -------
  Normal   Scheduled  38s               default-scheduler                                     Successfully assigned velero/velero-5489b955f6-kqb7z to gke-gke-cluster1-default-pool-a354fba3-8674
  Warning  Failed     22s               kubelet, gke-gke-cluster1-default-pool-a354fba3-8674  Failed to pull image "velero/velero-plugin-for-gcp:v1.1.0": rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     22s               kubelet, gke-gke-cluster1-default-pool-a354fba3-8674  Error: ErrImagePull
  Normal   BackOff    21s               kubelet, gke-gke-cluster1-default-pool-a354fba3-8674  Back-off pulling image "velero/velero-plugin-for-gcp:v1.1.0"
  Warning  Failed     21s               kubelet, gke-gke-cluster1-default-pool-a354fba3-8674  Error: ImagePullBackOff
  Normal   Pulling    8s (x2 over 37s)  kubelet, gke-gke-cluster1-default-pool-a354fba3-8674  Pulling image "velero/velero-plugin-for-gcp:v1.1.0"

Command used to install Velero (some of the values are given as variables):

velero install \
     --provider gcp \
     --plugins velero/velero-plugin-for-gcp:v1.1.0 \
     --bucket $storagebucket \
     --secret-file ~/velero-backup-storage-sa-key.json

Velero version

velero version
Client:
        Version: v1.4.2
        Git commit: 56a08a4d695d893f0863f697c2f926e27d70c0c5
<error getting server version: timed out waiting for server status request to be processed>

GKE version

v1.15.12-gke.2

Recommended answer

Isn't this a private cluster? – mario 31 mins ago

@mario this is a private cluster, but I can deploy other services without any issues (e.g. I have deployed nginx successfully) – Sreesan 15 mins ago

Well, this is a known limitation of GKE private clusters. As you can read in the documentation:

Can't pull image from public Docker Hub

Symptoms

A Pod running in your cluster displays a warning in kubectl describe such as Failed to pull image: rpc error: code = Unknown desc = Error response from daemon: Get https://registry-1.docker.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Potential causes

Nodes in a private cluster do not have outbound access to the public internet. They have limited access to Google APIs and services, including Container Registry.

Resolution

You cannot fetch images directly from Docker Hub. Instead, use images hosted on Container Registry. Note that while Container Registry's Docker Hub mirror is accessible from a private cluster, it should not be exclusively relied upon. The mirror is only a cache, so images are periodically removed, and a private cluster is not able to fall back to Docker Hub.
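One way to read the resolution above is to copy the images you need into your own project's Container Registry from a machine that does have internet access (Cloud Shell, for example) and point Velero at those copies. The following is only a sketch under assumptions that are not in the original post: PROJECT_ID is a placeholder, the helper names gcr_target and mirror_image are made up for illustration, Docker is assumed to be already authenticated against gcr.io, and the cluster nodes are assumed to have read access to that registry. The velero/velero:v1.4.2 image name is inferred from the client version in the question.

```shell
# Placeholder, not from the original post: replace with your GCP project ID.
PROJECT_ID="my-project"

# Hypothetical helper: map a Docker Hub reference such as
# "velero/velero-plugin-for-gcp:v1.1.0" to a path in your own registry by
# dropping the Docker Hub namespace (everything up to the first slash).
gcr_target() {
  printf 'gcr.io/%s/%s\n' "$1" "${2#*/}"
}

# Hypothetical helper: pull from Docker Hub, retag, and push to your registry.
# Run this where outbound internet access is available, not on a private node.
mirror_image() {
  target="$(gcr_target "$PROJECT_ID" "$1")"
  docker pull "$1" && docker tag "$1" "$target" && docker push "$target"
}

# Example usage, commented out so that nothing runs by accident:
# mirror_image velero/velero:v1.4.2
# mirror_image velero/velero-plugin-for-gcp:v1.1.0
# velero install \
#      --provider gcp \
#      --image "$(gcr_target "$PROJECT_ID" velero/velero:v1.4.2)" \
#      --plugins "$(gcr_target "$PROJECT_ID" velero/velero-plugin-for-gcp:v1.1.0)" \
#      --bucket $storagebucket \
#      --secret-file ~/velero-backup-storage-sa-key.json
```

The server image is mirrored as well because the Velero pod would otherwise fail on its main container once the plugin init container succeeds.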

You can also compare it with this answer.

You can easily verify this yourself with a simple experiment. Try to run two different nginx deployments: the first based on the image nginx (which is equivalent to nginx:latest) and the second based on nginx:1.14.2.

The first scenario works, because the nginx:latest image can be pulled from Container Registry's Docker Hub mirror, which is accessible from a private cluster. Any attempt to pull nginx:1.14.2, however, will fail, as you'll see in the Pod events. This happens because the kubelet cannot find that version of the image in GCR and tries to pull it from the public Docker registry (https://registry-1.docker.io/v2/), which is not possible in a private cluster. "The mirror is only a cache, so images are periodically removed, and a private cluster is not able to fall back to Docker Hub." - as you can read in the docs.
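The experiment can be sketched roughly as follows. The deployment names and the wrapper function run_mirror_experiment are made up for illustration, and the actual outcome depends on which tags the mirror happens to cache when you run it.

```shell
# Hypothetical wrapper so that nothing runs until you call it explicitly.
run_mirror_experiment() {
  # Untagged image, i.e. nginx:latest.
  kubectl create deployment nginx-latest --image=nginx

  # Pinned tag, which may be missing from the mirror.
  kubectl create deployment nginx-pinned --image=nginx:1.14.2

  # Compare the pod statuses, then look at the events of the pinned one.
  # "kubectl create deployment" labels the pods with app=<deployment name>.
  kubectl get pods
  kubectl describe pods -l app=nginx-pinned
}

# run_mirror_experiment
# Clean up afterwards:
# kubectl delete deployment nginx-latest nginx-pinned
```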

If you still have doubts, just ssh into one of your nodes and try running the following commands:

curl https://cloud.google.com/container-registry/

curl https://registry-1.docker.io/v2/

While the first one works perfectly, the second one will eventually fail:

curl: (7) Failed to connect to registry-1.docker.io port 443: Connection timed out

The reason? "Nodes in a private cluster do not have outbound access to the public internet."

You can search what is currently available in GCR here.

In many cases you should be able to get the required image if you don't specify its exact version (the latest tag is used by default). While that can help with nginx, unfortunately no version of velero/velero-plugin-for-gcp is currently available in Container Registry's Docker Hub mirror.

Granting the private nodes outbound internet access by using Cloud NAT seems to be the only reasonable solution in your case.
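A minimal sketch of setting up Cloud NAT with gcloud could look like the following. Every value here (region, network, router name, NAT name) is a placeholder that does not come from the original post, and the wrapper function enable_cloud_nat is made up for illustration; use the region and VPC network of your own private cluster.

```shell
# Placeholders, not from the original post: adjust to your environment.
REGION="us-central1"   # region of the private cluster
NETWORK="default"      # VPC network the cluster nodes are attached to
ROUTER="nat-router"    # hypothetical Cloud Router name
NAT="nat-config"       # hypothetical NAT configuration name

# Hypothetical wrapper so that nothing runs until you call it explicitly.
enable_cloud_nat() {
  # Cloud NAT is configured on a Cloud Router, so create one first.
  gcloud compute routers create "$ROUTER" \
      --network "$NETWORK" \
      --region "$REGION"

  # NAT every subnet range in the region, using auto-allocated external IPs.
  gcloud compute routers nats create "$NAT" \
      --router "$ROUTER" \
      --region "$REGION" \
      --auto-allocate-nat-external-ips \
      --nat-all-subnet-ip-ranges
}

# enable_cloud_nat
```

Once the nodes have outbound access, the kubelet should succeed on one of its next pull retries, and kubectl get pods -n velero can be used to confirm that.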
