Why do we need passwordless ssh for Hadoop?

Question

    • AFAIK, passwordless ssh is needed so that the master node can start the daemon processes on each slave node. Apart from that, does Hadoop use passwordless ssh for anything else during operation?

    • How are the user code jars and data chunks transferred across the slave nodes? I want to know the mechanism and the protocol used.

    • Should passwordless SSH be configured only for master-slave pairs, or also among the slaves?

    Solution

    You are correct. If ssh is not passwordless, you have to log in to each machine and start all the processes there manually. For your second question, none of this goes over ssh: all communication in HDFS is layered on TCP/IP, block data is streamed between clients and DataNodes over TCP, and HTTP is used for some data movement such as the MapReduce shuffle. The mechanism goes like this:

    A client establishes a connection to a configurable TCP port on the NameNode machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol.
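    To make the "RPC over a configurable TCP port" idea concrete, here is a toy sketch in Python. It is not Hadoop's wire format (Hadoop RPC has its own serialization); the newline-delimited JSON framing, the handler class, the reply contents, and the host names are all assumptions made up for illustration. It only shows the shape of the exchange: the client opens a TCP connection, sends a named method call, and reads back a reply.

```python
import json
import socket
import socketserver
import threading


class ToyNameNodeHandler(socketserver.StreamRequestHandler):
    """Answers one newline-delimited JSON request per connection."""

    def handle(self):
        request = json.loads(self.rfile.readline())
        if request["method"] == "getBlockLocations":
            # Made-up metadata: which DataNodes hold the file's blocks.
            result = {"path": request["path"], "datanodes": ["dn1:50010", "dn2:50010"]}
        else:
            result = {"error": "unknown method"}
        self.wfile.write((json.dumps(result) + "\n").encode())


def rpc_call(host, port, method, **params):
    """Open a TCP connection, send one request, return the decoded reply."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall((json.dumps({"method": method, **params}) + "\n").encode())
        with conn.makefile("r") as reader:
            return json.loads(reader.readline())


def demo():
    # Port 0 lets the OS pick a free port; a real NameNode port comes from config.
    with socketserver.TCPServer(("127.0.0.1", 0), ToyNameNodeHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        host, port = server.server_address
        try:
            return rpc_call(host, port, "getBlockLocations", path="/data/input.txt")
        finally:
            server.shutdown()


if __name__ == "__main__":
    print(demo())
```

    Note that ssh plays no part in this exchange: the client only needs network access to the server's TCP port.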

    And for the third question, it's not necessary to set up passwordless ssh among the slave nodes.
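    As a rough sketch of why only the master needs key-based login to the slaves, the snippet below builds (but does not run) the kind of ssh command a start script would issue for each host in a slaves file. The host names, the script path `bin/hadoop-daemon.sh`, and the use of `-o BatchMode=yes` are assumptions for illustration; check your own distribution's scripts for the exact invocation.

```python
import shlex


def parse_slaves(text):
    """Return hostnames from slaves-file text, skipping blanks and # comments."""
    hosts = []
    for line in text.splitlines():
        host = line.split("#", 1)[0].strip()
        if host:
            hosts.append(host)
    return hosts


def start_commands(hosts, daemon="datanode", script="bin/hadoop-daemon.sh"):
    """Build one non-interactive ssh command per worker; nothing is executed here."""
    commands = []
    for host in hosts:
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # which is why key-based (passwordless) login has to be in place.
        commands.append(["ssh", "-o", "BatchMode=yes", host, f"{script} start {daemon}"])
    return commands


# Hypothetical worker list, in the one-host-per-line style of a slaves file.
slaves_text = """
# hypothetical worker list
slave1.example.com
slave2.example.com   # second rack
"""

for cmd in start_commands(parse_slaves(slaves_text)):
    print(shlex.join(cmd))
```

    Every command originates on the master and targets one slave, so no slave ever has to ssh into another slave.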
