Can I use Spark without Hadoop for a development environment?
Question
I'm very new to the concepts of Big Data and related areas, so I'm sorry if I've made a mistake or typo.
I would like to understand Apache Spark and use it only on my computer, in a development/test environment. Since Hadoop includes HDFS (Hadoop Distributed File System) and other software that only matters for distributed systems, can I discard it? If so, where can I download a version of Spark that doesn't need Hadoop? Here I can find only Hadoop-dependent versions.
What I need:

- Run all of Spark's features on a single machine (my home computer) without problems.
- Have everything I do with Spark on my machine run later on a cluster without problems (see the sketch below).
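
For example, what I have in mind is writing a job once and switching only the master URL between my machine and a cluster. A rough sketch of that idea, assuming Spark 2.x (the SPARK_MASTER variable name is just my own illustration):

```scala
import org.apache.spark.sql.SparkSession

// Only the master URL should differ between environments:
// "local[*]" on my home computer, e.g. "spark://host:7077" on a cluster.
val master = sys.env.getOrElse("SPARK_MASTER", "local[*]")

val spark = SparkSession.builder()
  .appName("PortableJob")
  .master(master)
  .getOrCreate()

// A trivial job, just to show that the code itself stays the same.
val count = spark.range(1, 1001).count()
println(s"Count: $count")

spark.stop()
```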
Is there any reason to use Hadoop or any other distributed file system for Spark if I will run it on my computer for testing purposes?
Note that "Can apache spark run without hadoop?" is a different question from mine, because I do want to run Spark in a development environment.
Answer
Yes, you can install Spark without Hadoop. Go through the official Spark documentation: http://spark.apache.org/docs/latest/spark-standalone.html
Rough steps:
- Download a pre-built Spark, or download the Spark source and build it locally
- Extract the tar
- Set the required environment variables
- Run the start scripts
Spark (without Hadoop) - available on the Spark download page: https://www.apache.org/dyn/closer.lua/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz
If this URL does not work, try to get it from the Spark download page.
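
Once extracted, a quick sanity check is to run a single job in local mode, with no Hadoop or HDFS involved. A minimal sketch, assuming Spark 2.x (paste into spark-shell, which ships in the download; the file path is a made-up example):

```scala
import org.apache.spark.sql.SparkSession

// "local[*]" runs Spark inside this one JVM using all local cores;
// no cluster, no HDFS, no Hadoop installation needed.
val spark = SparkSession.builder()
  .appName("NoHadoopCheck")
  .master("local[*]")
  .getOrCreate()

// "file://" reads from the local filesystem instead of HDFS.
val lines = spark.read.textFile("file:///tmp/sample.txt")
println(s"Line count: ${lines.count()}")

spark.stop()
```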