Spark support for using window functions

Problem Description


I am using Spark version 1.6.0 with Python. I found that window functions are not supported by the version of Spark I am using: when I tried to use a window function in my query (via Spark SQL), it gave me the error 'you need to build spark with hive functionality'. Following that I searched around and found that I needed to use Spark version 1.4.0, which I tried with no luck. Some posts also suggested building Spark with Hive functionality, but I did not find the right way to do it.
When I used Spark 1.4.0, I got the following error:

raise ValueError("invalid mode %r (only r, w, b allowed)")
ValueError: invalid mode %r (only r, w, b allowed)
16/04/04 14:17:17 WARN PythonRDD: Incomplete task interrupted: Attempting to kill Python Worker
16/04/04 14:17:17 INFO HadoopRDD: Input split: file:/C:/Users/test
esktop/spark-1.4.0-bin-hadoop2.4/test:910178+910178
16/04/04 14:17:17 INFO Executor: Executor killed task 1.0 in stage 1.0 (TID 2)
16/04/04 14:17:17 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, localhost): TaskKilled (killed intentionally)
16/04/04 14:17:17 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool

Solution

I think this is the third time I have answered a similar question:

Window functions are supported with HiveContext, not with the regular SQLContext.
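
For illustration, here is a minimal PySpark sketch against the 1.4-1.6 era API; the app name, the toy data, the table name t, and the columns grp/val are all hypothetical, and it assumes a Hive-enabled build of Spark:

from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext(appName="window-function-demo")
sqlContext = HiveContext(sc)  # HiveContext, not the plain SQLContext

# Hypothetical toy data: two groups with integer values.
df = sqlContext.createDataFrame([("a", 1), ("a", 2), ("b", 3)], ["grp", "val"])
df.registerTempTable("t")

# row_number() OVER (...) is a window function; on these Spark versions it
# runs through HiveContext, not through the plain SQLContext.
sqlContext.sql(
    "SELECT grp, val, "
    "row_number() OVER (PARTITION BY grp ORDER BY val) AS rn "
    "FROM t").show()

If Spark itself was built without Hive support, even the HiveContext is unusable, which is where the build instructions below come in.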

Concerning how to build Spark with Hive support, the answer is in the official Building Spark documentation:

Building with Hive and JDBC Support

To enable Hive integration for Spark SQL along with its JDBC server and CLI, add the -Phive and -Phive-thriftserver profiles to your existing build options. By default Spark will build with Hive 0.13.1 bindings.

Apache Hadoop 2.4.X with Hive 13 support (example):

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package
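
Once that build finishes, a quick smoke test from the resulting distribution can confirm that Hive support made it in. This is just a sketch, meant to be run inside the pyspark shell (where sc is already defined); on a build without Hive support it typically fails with the same 'build Spark with Hive' error:

from pyspark.sql import HiveContext

hc = HiveContext(sc)  # `sc` is provided by the pyspark shell
hc.sql("SELECT 1 AS ok").show()  # prints a one-row table if Hive support is present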

Building for Scala 2.11

To produce a Spark package compiled with Scala 2.11, use the -Dscala-2.11 property:

./dev/change-scala-version.sh 2.11
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package

There is no magic here; everything is in the documentation.
