Why do we need passwordless SSH for Hadoop?


Question

  • AFAIK, passwordless SSH is needed so that the master node can start the daemon processes on each slave node. Apart from that, does passwordless SSH serve any other purpose in Hadoop's operation?

How are the user code jars and data chunks transferred across the slave nodes? I want to know the mechanism and the protocol used.

Should passwordless SSH be configured ONLY for master-slave pairs, or also among the slaves?

Recommended answer

You are correct. If SSH is not passwordless, you have to log in to each machine and start all the processes there manually. For your second question, all communication in HDFS happens over TCP/IP. Control traffic goes through RPC, while block data is streamed directly between clients and DataNodes over TCP; HTTP is used elsewhere, for example when reducers fetch map output during the MapReduce shuffle. None of this traffic goes over SSH. The mechanism goes like this:

A client establishes a connection to a configurable TCP port on the NameNode machine and speaks the ClientProtocol with the NameNode. The DataNodes talk to the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol.
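To make the "configurable TCP port" concrete, here is a minimal Python sketch that takes an fs.defaultFS-style URI, extracts the NameNode host and port, and checks whether a plain TCP connection can be opened. It does not speak the Hadoop RPC protocol; it only tests reachability. The hostname in the usage section is hypothetical, and the fallback port 8020 is an assumption about a common default, so check your own core-site.xml.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse

# Assumption: fallback NameNode RPC port when the URI omits one.
# Your cluster may use a different value; see fs.defaultFS in core-site.xml.
DEFAULT_NAMENODE_RPC_PORT = 8020


def namenode_endpoint(default_fs: str) -> Tuple[str, int]:
    """Split an fs.defaultFS-style URI into (host, port)."""
    parsed = urlparse(default_fs)
    if parsed.scheme != "hdfs" or not parsed.hostname:
        raise ValueError(f"not an hdfs:// URI: {default_fs!r}")
    return parsed.hostname, parsed.port or DEFAULT_NAMENODE_RPC_PORT


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical NameNode address, for illustration only.
    host, port = namenode_endpoint("hdfs://namenode.example.com:8020")
    state = "reachable" if tcp_port_open(host, port) else "unreachable"
    print(f"NameNode RPC endpoint {host}:{port} is {state}")
```

A successful connection here only shows that the port accepts TCP connections, which is the transport the client and DataNode protocols run on. No SSH is involved in this path.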

As for the third question, passwordless SSH among the slave nodes is not necessary. The start scripts run on the master and log in to each slave from there, so the slaves never need to SSH into one another.
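Since only the master needs key-based access to the slaves, a quick check run from the master can confirm that each slave accepts a non-interactive login. The sketch below assumes an OpenSSH client on the master; `BatchMode` and `ConnectTimeout` are standard OpenSSH options, while the helper names and worker hostnames are hypothetical.

```python
import subprocess
from typing import List


def ssh_check_command(host: str, timeout_s: int = 5) -> List[str]:
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                # never prompt; fail if a password is required
        "-o", f"ConnectTimeout={timeout_s}",  # give up quickly on unreachable hosts
        host,
        "true",                               # trivial remote command
    ]


def passwordless_ok(host: str) -> bool:
    """Return True if `host` accepts a non-interactive ssh login."""
    try:
        result = subprocess.run(
            ssh_check_command(host), capture_output=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical worker hostnames; in practice these would come from
    # the list of slave hosts that your start scripts read.
    for worker in ["worker1.example.com", "worker2.example.com"]:
        status = "ok" if passwordless_ok(worker) else "needs key setup"
        print(f"{worker}: {status}")
```

Running this on the master is enough; there is no need to repeat it from one slave to another.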
