Kubernetes service IPs not reachable

Problem description

So I've got a Kubernetes cluster up and running using the Kubernetes on CoreOS Manual Installation Guide.

$ kubectl get no
NAME              STATUS                     AGE
coreos-master-1   Ready,SchedulingDisabled   1h
coreos-worker-1   Ready                      54m

$ kubectl get cs
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}

$ kubectl get pods --all-namespaces -o wide
NAMESPACE     NAME                                      READY     STATUS    RESTARTS   AGE       IP               NODE
default       curl-2421989462-h0dr7                     1/1       Running   1          53m       10.2.26.4        coreos-worker-1
kube-system   busybox                                   1/1       Running   0          55m       10.2.26.3        coreos-worker-1
kube-system   kube-apiserver-coreos-master-1            1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-controller-manager-coreos-master-1   1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-proxy-coreos-master-1                1/1       Running   0          1h        192.168.0.200   coreos-master-1
kube-system   kube-proxy-coreos-worker-1                1/1       Running   0          58m       192.168.0.204   coreos-worker-1
kube-system   kube-scheduler-coreos-master-1            1/1       Running   0          1h        192.168.0.200   coreos-master-1

$ kubectl get svc --all-namespaces
NAMESPACE   NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     kubernetes   10.3.0.1     <none>        443/TCP   1h

As in the guide, I've set up a service network 10.3.0.0/16 and a pod network 10.2.0.0/16. The pod network seems fine, as the busybox and curl containers get IPs. But the service network has problems. I originally ran into this when deploying kube-dns: the service IP 10.3.0.1 couldn't be reached, so kube-dns couldn't start all of its containers and DNS ultimately didn't work.

From within the curl pod, I can reproduce the issue:

[ root@curl-2421989462-h0dr7:/ ]$ curl https://10.3.0.1
curl: (7) Failed to connect to 10.3.0.1 port 443: No route to host

[ root@curl-2421989462-h0dr7:/ ]$ ip route
default via 10.2.26.1 dev eth0
10.2.0.0/16 via 10.2.26.1 dev eth0
10.2.26.0/24 dev eth0  src 10.2.26.4

It seems OK that there's only a default route in the container. As I understand it, the request (sent via the default route) should be intercepted by the kube-proxy on the worker node and forwarded to the proxy on the master node, where the IP is translated via iptables to the master's public IP.

There seems to be a common problem with a bridge/netfilter sysctl setting, but that seems fine in my setup:

core@coreos-worker-1 ~ $ sysctl net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-iptables = 1

I'm having a really hard time troubleshooting this, as I don't understand what the service IP is used for, how the service network is supposed to work in terms of traffic flow, or how best to debug it.

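So here are my questions:

  • What is the first IP of the service network (here 10.3.0.1) used for?
  • Is the above description of the traffic flow correct? If not, what steps does a request from a container take to reach a service IP?
  • What is the best way to debug each step in the traffic flow? (I can't tell from the logs what's wrong.)

Thanks!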

Recommended answer

The Service network provides fixed IPs for Services. It is not a routable network (so don't expect ip ro to show anything, and ping won't work either) but a collection of iptables rules managed by kube-proxy on each node (see iptables -L; iptables -t nat -L on the nodes, not in the Pods). These virtual IPs (see the pics!) act as a load-balancing proxy for endpoints (kubectl get ep), which are usually ports of Pods (but not always) with a specific set of labels as defined in the Service.
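
One way to check this on a node is to look for the Service's cluster IP in the nat table. The helper below is only a sketch and not part of the original answer: has_service_rule is a hypothetical name, and it assumes kube-proxy runs in iptables mode, where Service rules are written to the KUBE-SERVICES chain (in userspace mode the chain names differ).

```shell
# Reads `iptables-save` output on stdin and reports whether a
# KUBE-SERVICES rule targets the given cluster IP.
# Assumption: kube-proxy is in iptables mode.
has_service_rule() {
  cluster_ip="$1"
  # Keep only KUBE-SERVICES rules, then look for "-d <ip>/32".
  if grep -F -- "-A KUBE-SERVICES" | grep -qF -- "-d ${cluster_ip}/32"; then
    echo "rule found for ${cluster_ip}"
  else
    echo "no rule for ${cluster_ip}"
  fi
}

# Usage on a node (not inside a Pod), as root:
#   iptables-save -t nat | has_service_rule 10.3.0.1
```

If no rule is reported for 10.3.0.1, that would suggest kube-proxy on that node has not programmed the Service, which narrows the search to kube-proxy itself.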

The first IP on the Service network is for reaching the kube-apiserver itself. It's listening on port 443 (kubectl describe svc kubernetes).
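
The built-in kubernetes Service gets the first address of the configured service range, which is why it appears as 10.3.0.1 in the question's output. As a rough illustration, the hypothetical helper first_service_ip below derives that address from the CIDR. It is a simplification that only handles IPv4 ranges whose network address does not end in 255.

```shell
# Derive the first address of an IPv4 service range.
# Assumption: the argument is a network address in CIDR form, e.g. 10.3.0.0/16.
first_service_ip() {
  net="${1%/*}"        # strip the prefix length: 10.3.0.0
  last="${net##*.}"    # last octet: 0
  echo "${net%.*}.$((last + 1))"
}

first_service_ip 10.3.0.0/16   # prints 10.3.0.1

# To see what sits behind that virtual IP on a live cluster:
#   kubectl get ep kubernetes
```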

Troubleshooting is different on each network/cluster setup. I would generally check:

  • Is kube-proxy running on each node? On some setups it's run via systemd, and on others there is a DaemonSet that schedules a Pod on each node. On your setup it is deployed as static Pods created by the kubelets themselves from /etc/kubernetes/manifests/kube-proxy.yaml.
  • Locate the logs for kube-proxy and look for clues (can you post some?).
  • Change kube-proxy to userspace mode. Again, the details depend on your setup. For you it's in the file I mentioned above. Append --proxy-mode=userspace as a parameter on each node.
  • Is the overlay (pod) network functional?
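
The first two checks can be sketched roughly as follows. kube_proxy_nodes is a hypothetical helper, not something from the original answer, and it assumes the eight-column layout of kubectl get pods --all-namespaces -o wide shown in the question (newer kubectl versions print extra columns, so the field numbers may need adjusting).

```shell
# Reads `kubectl get pods --all-namespaces -o wide` output on stdin and
# prints the node of every kube-proxy Pod that is in Running state.
# Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE
kube_proxy_nodes() {
  awk '$2 ~ /^kube-proxy/ && $4 == "Running" { print $8 }'
}

# Usage against a live cluster:
#   kubectl get pods --all-namespaces -o wide | kube_proxy_nodes
# Compare the result with `kubectl get no`, then read the logs of a proxy,
# using a Pod name from the listing, for example:
#   kubectl logs --namespace=kube-system kube-proxy-coreos-worker-1
```

Any node that appears in kubectl get no but not in the helper's output would be the first place to look.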

If you leave comments, I will get back to you.
