Kubernetes service IPs not reachable


Question

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.

$ kubectl get no
NAME              STATUS                     AGE
coreos-master-1   Ready,SchedulingDisabled   1h
coreos-worker-1   Ready                      54m

$ kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                      READY     STATUS    RESTARTS   AGE       IP               NODE
default       curl-2421989462-h0dr7                     1/1       Running   1          53m       10.2.26.4        coreos-worker-1
kube-system   busybox                                   1/1       Running   0          55m       10.2.26.3        coreos-worker-1
kube-system   kube-apiserver-coreos-master-1            1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-controller-manager-coreos-master-1   1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-proxy-coreos-master-1                1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-proxy-coreos-worker-1                1/1       Running   0          58m       192.168.0.204   coreos-worker-1
kube-system   kube-scheduler-coreos-master-1            1/1       Running   0          1h        192.168.0.200   coreos-master-1

$ kubectl get svc --all-namespaces
NAMESPACE   NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     kubernetes   10.3.0.1     <none>        443/TCP   1h

As in the guide, I've set up a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. The pod network seems fine, as the busybox and curl containers get IPs. But the service network has problems. I originally ran into this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all of its containers and DNS ultimately didn't work.

From within the curl pod, I can reproduce the issue:

[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host

[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0  src 10.2.26.4

It seems OK that there's only a default route in the container. As I understand it, the request (sent via the default route) should be intercepted by the kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.

There seems to be a common problem with a bridge/netfilter sysctl setting, but that looks fine in my setup:

core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1

I'm having a really hard time troubleshooting this, as I don't understand what the service IP is used for, how the service network is supposed to work in terms of traffic flow, or how best to debug it.

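Here are my questions:

  • What is the first IP of the service network (10.3.0.1 in this case) used for?
  • Is the above description of the traffic flow correct? If not, what steps does it take for a container to reach a service IP?
  • What are the best ways to debug each step in the traffic flow? (I can't tell from the logs what's wrong.)

Thanks!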

Recommended answer

The Service network provides fixed IPs for Services. It is not a routable network (so don't expect ip ro to show anything, nor will ping work) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not in the Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) selected by a specific set of labels as defined in the Service.
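
As a rough illustration of the point above, the rules kube-proxy installs for one Service IP can be picked out of a node's NAT table. The helper below is a hypothetical sketch (the name rules_for_service_ip is not part of any Kubernetes tool). It assumes kube-proxy writes rules whose destination is the Service IP as a /32, and it only filters whatever text is fed to it on stdin:

```shell
# Hypothetical helper: keep only the rules whose destination is the given
# Service IP. Reads iptables-save style output on stdin.
# Exit status is non-zero when no rule matches (grep's behaviour).
rules_for_service_ip() {
  # -F: match the IP as a fixed string, so the dots are not regex wildcards.
  # --: stop option parsing, because the pattern itself starts with "-d".
  grep -F -- "-d $1/32"
}
```

On a node (not in a Pod), something like `sudo iptables-save -t nat | rules_for_service_ip 10.3.0.1` should print the matching rules. No output would suggest that kube-proxy has not programmed that Service on the node.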

The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
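
To make "the first IP on the Service network" concrete, here is a small worked sketch in plain POSIX shell arithmetic. The function name first_service_ip is made up for illustration. It masks the address down to the network address and adds one, so 10.3.0.0/16 gives the 10.3.0.1 seen in the kubectl get svc output above:

```shell
# Hypothetical helper: print the first address after the network address of
# an IPv4 CIDR. The body runs in a subshell, so no variables leak out.
first_service_ip() (
  base=${1%/*}     # address part, e.g. 10.3.0.0
  prefix=${1#*/}   # prefix length, e.g. 16
  IFS=.
  set -- $base     # split the address into its four octets
  n=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  first=$(( (n & mask) + 1 ))
  printf '%d.%d.%d.%d\n' \
    $(( (first >> 24) & 255 )) $(( (first >> 16) & 255 )) \
    $(( (first >> 8) & 255 )) $(( first & 255 ))
)

first_service_ip 10.3.0.0/16   # prints 10.3.0.1
```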

Troubleshooting is different on each network/cluster setup. I would generally check:

  • Is kube-proxy running on each node? On some setups it's run via systemd, and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml
  • Locate the logs for kube-proxy and look for clues (can you post some?)
  • Change kube-proxy to userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node
  • Is the overlay (pod) network functional?
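
Before appending --proxy-mode=userspace as suggested in the list above, it may help to see whether the manifest already sets a proxy mode. This is a minimal sketch, assuming the flag appears in the manifest in the --proxy-mode=VALUE form. The name proxy_mode_of is a hypothetical helper, and the path is the one mentioned in the answer:

```shell
# Hypothetical helper: print the value of --proxy-mode found in a manifest
# file, or "unset" if the flag does not appear. Runs in a subshell.
proxy_mode_of() (
  # Print only the flag's value, and keep the first occurrence.
  mode=$(sed -n 's/.*--proxy-mode=\([A-Za-z]*\).*/\1/p' "$1" | head -n 1)
  printf '%s\n' "${mode:-unset}"
)

# Usage on a node (assumed path, taken from the answer):
#   proxy_mode_of /etc/kubernetes/manifests/kube-proxy.yaml
```

If this prints "unset", kube-proxy is running with whatever default mode its version picks.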

If you leave comments, I will get back to you.
