Hadoop's HDFS with Spark
Question
I am new to cluster computing and I am trying to set up a minimal 2-node Spark cluster. What I am still a bit confused about: do I have to set up a full Hadoop installation first, or does Spark ship with an included Hadoop version?
The material I find about Spark does not really make this clear. I understand that Spark is meant as an extension to Hadoop rather than a replacement for it, but whether it requires an independently running Hadoop system is not clear to me.
I require an HDFS; is it therefore enough to just use the file-system part of Hadoop?
Could someone point out this probably obvious thing to me?
Recommended answer
Apache Spark is independent of Hadoop. Spark lets you use different sources of data (including HDFS) and can run either in a standalone cluster or on an existing resource-management framework (e.g. YARN or Mesos).
So if you are only interested in Spark, there is no need to install Hadoop.