I can't get through to Hadoop server from Hadoop client


Question

The Hadoop server runs in Kubernetes, and the Hadoop client sits on an external network, so I try to reach the Hadoop server through a Kubernetes Service. But hadoop fs -put does not work from the Hadoop client. As far as I know, the namenode gives the datanode IPs to the Hadoop client. If so, where does the namenode get those IPs from?
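For context, here is a minimal sketch of what the client side is doing, assuming a hypothetical Service address hdfs-namenode.example.com:8020 for the namenode (substitute your own). Metadata-only calls go through the namenode alone and typically succeed, while the put itself needs a direct datanode connection:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutThroughService {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical Kubernetes Service address fronting the namenode.
        conf.set("fs.defaultFS", "hdfs://hdfs-namenode.example.com:8020");

        FileSystem fs = FileSystem.get(conf);

        // Metadata operations only talk to the namenode, so they usually
        // work through the Service.
        System.out.println("Home dir: " + fs.getHomeDirectory());

        // A write (the equivalent of `hadoop fs -put`) makes the client open
        // a direct connection to a datanode, at whatever address the namenode
        // handed back. If that address is a pod IP, this step hangs or fails
        // from outside the cluster.
        fs.copyFromLocalFile(new Path("/tmp/local.txt"), new Path("/tmp/remote.txt"));
        fs.close();
    }
}
```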

Answer

You can check my other answer. HDFS is not production-ready on K8s yet (as of this writing).

The namenode gives the client the IP addresses of the datanodes, and it learns those addresses when the datanodes register on joining the cluster, as sketched below:
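A minimal sketch (not from the original answer) of asking the namenode for its registered datanodes via Hadoop's DistributedFileSystem API; the Service address is hypothetical. The addresses printed are exactly the ones each datanode announced when it joined:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

public class ListDatanodes {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical namenode Service address.
        conf.set("fs.defaultFS", "hdfs://hdfs-namenode.example.com:8020");

        DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

        // The namenode reports each datanode with the address it registered,
        // i.e. the address the datanode announced on joining the cluster.
        for (DatanodeInfo dn : dfs.getDataNodeStats()) {
            System.out.println(dn.getHostName() + " -> " + dn.getXferAddr());
        }
        dfs.close();
    }
}
```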

The issue in K8s is that you have to expose each datanode as a Service or external IP, but the namenode sees the datanodes with their pod IP addresses, which are not reachable from the outside world. HDFS also doesn't provide a per-datanode "publish IP" setting that would let you force a Service IP on it, so you either have to do fancy custom networking or your client has to sit inside the podCIDR (which somewhat defeats the purpose of HDFS as a distributed filesystem).
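To make the pod-IP problem concrete, here is a hedged sketch (again with a hypothetical Service address and file path) that prints the datanode addresses the namenode hands back for a file's blocks. Run from outside the cluster, these typically come back as unroutable pod IPs:

```java
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlockLocations {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical namenode Service address.
        conf.set("fs.defaultFS", "hdfs://hdfs-namenode.example.com:8020");

        FileSystem fs = FileSystem.get(conf);
        FileStatus status = fs.getFileStatus(new Path("/tmp/remote.txt"));

        // Each BlockLocation carries the datanode host:port pairs the
        // namenode returns to the client. From an external network these are
        // pod IPs (e.g. 10.244.x.x) that are not routable, which is why the
        // read/write pipeline fails even though the namenode is reachable.
        for (BlockLocation loc : fs.getFileBlockLocations(status, 0, status.getLen())) {
            System.out.println(Arrays.toString(loc.getNames()));
        }
        fs.close();
    }
}
```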
