Spark 1.4 image for Google Cloud?


Problem description

With bdutil, the latest Spark tarball I can find is for Spark 1.3.1:

gs://spark-dist/spark-1.3.1-bin-hadoop2.6.tgz
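
For reference, a quick way to check which tarballs are published in that bucket is a gsutil listing (a sketch assuming the gsutil CLI is installed and the bucket is publicly listable):

gsutil ls gs://spark-dist/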

There are a few new DataFrame features in Spark 1.4 that I want to use. Any chance a Spark 1.4 image could be made available for bdutil, or is there any workaround?

Update:

Following the suggestion from Angus Davis, I downloaded and pointed to spark-1.4.1-bin-hadoop2.6.tgz, and the deployment went well; however, I ran into an error when calling SqlContext.parquetFile(). I cannot explain why this exception is possible, since GoogleHadoopFileSystem should be a subclass of org.apache.hadoop.fs.FileSystem. I will continue investigating this.

Caused by: java.lang.ClassCastException: com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem cannot be cast to org.apache.hadoop.fs.FileSystem
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2595)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:112)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:144)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:504)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
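
For context, this is the kind of hypothetical spark-shell session that exercises the failing code path (the Parquet path is a placeholder; sqlContext is the Hive-enabled SQLContext that spark-shell predefines, which matches the Hive metastore frames in the trace above):

spark-shell
scala> sqlContext.parquetFile("gs://your-bucket/some-data.parquet")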

I asked about the exception in a separate question here: http://stackoverflow.com/questions/31478955/googlehadoopfilesystem-cannot-be-cast-to-hadoop-filesystem

Update:

The error turned out to be a Spark defect; a resolution/workaround is provided in the question above.

Thanks!

海鹰

Answer

If a local workaround is acceptable, you can copy spark-1.4.1-bin-hadoop2.6.tgz from an Apache mirror into a bucket that you control. You can then edit extensions/spark/spark-env.sh and change SPARK_HADOOP2_TARBALL_URI='<your copy of spark 1.4.1>' (make certain that the service account running your VMs has permission to read the tarball).
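
A minimal sketch of those two steps (the bucket name is a placeholder, and the Apache archive URL is one possible mirror location):

wget https://archive.apache.org/dist/spark/spark-1.4.1/spark-1.4.1-bin-hadoop2.6.tgz
gsutil cp spark-1.4.1-bin-hadoop2.6.tgz gs://your-bucket/spark-1.4.1-bin-hadoop2.6.tgz

Then, in extensions/spark/spark-env.sh:

SPARK_HADOOP2_TARBALL_URI='gs://your-bucket/spark-1.4.1-bin-hadoop2.6.tgz'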

Note that I haven't done any testing to see whether Spark 1.4.1 works out of the box right now, but I'd be interested in hearing about your experience if you decide to give it a go.

