How to upgrade Spark to newer version?


Question

I have a virtual machine which has Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily due to certain supported functionalities which were not in 1.3. Is it possible to upgrade the Spark version from 1.3 to 1.5, and if yes, how can I do that?

Answer

Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". Actually, all you have to do is:

  • Download the appropriate Spark distro (pre-built for Hadoop 2.6 and later, in your case)
  • Unzip the tar file in the appropriate directory (i.e. where the folder spark-1.3.1-bin-hadoop2.6 already is)
  • Update your SPARK_HOME (and possibly some other environment variables, depending on your setup) accordingly

Here is what I just did myself, to go from 1.3.1 to 1.5.2, in a setting similar to yours (a Vagrant VM running Ubuntu):

1) Download the tar file in the appropriate directory

vagrant@sparkvm2:~$ cd $SPARK_HOME
vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
ipcluster2    ipengine       iptest2  jsonschema
ipcontroller  ipengine2      ipython  pygmentize
vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
[...]
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
ipcluster2    ipengine       iptest2  jsonschema  spark-1.5.2-bin-hadoop2.6.tgz
ipcontroller  ipengine2      ipython  pygmentize

Notice that the exact mirror you should use with wget will probably be different from mine, depending on your location; you will get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
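
If the mirror link has gone stale, old releases also remain available on the permanent Apache archive. A minimal alternative (the archive URL below is my assumption, not part of the original answer; verify it resolves before relying on it):

vagrant@sparkvm2:/usr/local/bin$ sudo wget https://archive.apache.org/dist/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz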

2) Unzip the tgz file with

vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster     ipcontroller2  iptest   ipython2    spark-1.3.1-bin-hadoop2.6
ipcluster2    ipengine       iptest2  jsonschema  spark-1.5.2-bin-hadoop2.6
ipcontroller  ipengine2      ipython  pygmentize  spark-1.5.2-bin-hadoop2.6.tgz

You can see that now you have a new folder, spark-1.5.2-bin-hadoop2.6.

3) Update SPARK_HOME accordingly (and possibly other environment variables you are using) to point to this new directory instead of the previous one.
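
A minimal sketch of what that can look like, assuming SPARK_HOME is exported from ~/.bashrc (your setup may define it elsewhere, e.g. /etc/profile or ~/.profile):

# in ~/.bashrc: point SPARK_HOME at the new distribution
export SPARK_HOME=/usr/local/bin/spark-1.5.2-bin-hadoop2.6
export PATH=$SPARK_HOME/bin:$PATH

vagrant@sparkvm2:~$ source ~/.bashrc    # reload in the current shell
vagrant@sparkvm2:~$ echo $SPARK_HOME
/usr/local/bin/spark-1.5.2-bin-hadoop2.6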

And you should be done, after restarting your machine.
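
A quick way to confirm that the new version is being picked up is to ask spark-submit for its version (output trimmed here):

vagrant@sparkvm2:~$ spark-submit --version
[...]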

Please note:

  1. You don't need to remove the previous Spark distribution, as long as all the relevant environment variables point to the new one. That way, you can even quickly move "back and forth" between the old and the new version, in case you want to test things (i.e. you just have to change the relevant environment variables).
  2. sudo was necessary in my case; it may be unnecessary for you, depending on your settings.
  3. After ensuring that everything works fine, it's a good idea to delete the downloaded tgz file.
  4. You can use the exact same procedure to upgrade to future versions of Spark as they come out (which happens rather fast). If you do, either make sure that previous tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above); see the sketch after this list.
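
For instance, covering notes 3 and 4 together (the 1.6.0 file name below is purely hypothetical, standing in for whatever future release you download):

vagrant@sparkvm2:/usr/local/bin$ sudo rm spark-1.5.2-bin-hadoop2.6.tgz
vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.6.0-bin-hadoop2.6.tgz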

