How to upgrade Spark to a newer version?
Question
I have a virtual machine which has Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily due to certain supported functionalities which were not in 1.3. Is it possible to upgrade the Spark version from 1.3 to 1.5, and if yes, how can I do that?
Answer
Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". Actually, all you have to do is:
- Download the appropriate Spark distro (pre-built for Hadoop 2.6 and later, in your case)
- Unzip the tar file in the appropriate directory (i.e. where the folder spark-1.3.1-bin-hadoop2.6 already is)
- Update your SPARK_HOME (and possibly some other environment variables, depending on your setup) accordingly
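The three steps above can be sketched as a small shell script. The version numbers, install directory, and mirror URL below are assumptions taken from the session that follows; adjust them to your own setup.

```shell
# Sketch of the three upgrade steps; SPARK_VERSION, HADOOP_VERSION and
# INSTALL_DIR are assumptions -- adjust them to your own setup.
SPARK_VERSION="1.5.2"
HADOOP_VERSION="2.6"
INSTALL_DIR="/usr/local/bin"
SPARK_DIST="spark-${SPARK_VERSION}-bin-hadoop${HADOOP_VERSION}"

# 1) Download the pre-built distribution (pick your own mirror from the
#    "Download Spark" page; this URL is only an example):
# wget "http://apache.tsl.gr/spark/spark-${SPARK_VERSION}/${SPARK_DIST}.tgz" -P "$INSTALL_DIR"

# 2) Unpack the tar file next to the old distribution:
# tar -xzf "${INSTALL_DIR}/${SPARK_DIST}.tgz" -C "$INSTALL_DIR"

# 3) Point SPARK_HOME at the new folder:
export SPARK_HOME="${INSTALL_DIR}/${SPARK_DIST}"
echo "$SPARK_HOME"
```

The download and unpack lines are commented out here so the sketch only shows the shape of the procedure; uncomment them (with your chosen mirror) to actually run it.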
Here is what I just did myself, to go from 1.3.1 to 1.5.2, in a setting similar to yours (vagrant VM running Ubuntu):
1) Download the tar file in the appropriate directory
vagrant@sparkvm2:~$ cd $SPARK_HOME
vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema
ipcontroller ipengine2 ipython pygmentize
vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
[...]
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6.tgz
ipcontroller ipengine2 ipython pygmentize
Notice that the exact mirror you should use with wget will probably be different from mine, depending on your location; you will get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
2) Unpack it with the tar command
vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6
ipcontroller ipengine2 ipython pygmentize spark-1.5.2-bin-hadoop2.6.tgz
You can see that now you have a new folder, spark-1.5.2-bin-hadoop2.6.
3) Update SPARK_HOME (and possibly other environment variables you are using) accordingly, to point to this new directory instead of the previous one.
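On a setup like the one above, this step could look like the following, typically placed in a shell startup file such as ~/.bashrc. The PATH entry is an assumption; you may or may not have one.

```shell
# Re-point the Spark-related environment variables to the new directory.
# The PATH entry is an assumption; you may or may not have one.
export SPARK_HOME=/usr/local/bin/spark-1.5.2-bin-hadoop2.6
export PATH="$SPARK_HOME/bin:$PATH"
echo "$SPARK_HOME"
```

After editing the startup file, open a new shell (or source the file) so the new values take effect.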
And you should be done, after restarting your machine.
Note:
- You don't need to remove the previous Spark distribution, as long as all the relevant environment variables point to the new one. That way, you can even quickly move "back and forth" between the old and the new version, in case you want to test things (i.e. you just have to change the relevant environment variables).
- sudo was necessary in my case; it may be unnecessary for you, depending on your settings.
- After ensuring that everything works fine, it is a good idea to delete the downloaded tgz file.
- You can use the exact same procedure to upgrade to future versions of Spark as they come out (rather fast). If you do this, either make sure that the previous tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above).