Spark support for using window functions
Problem Description
I am using Spark version 1.6.0 with Python. I found that window functions are not supported by the version of Spark I am using: when I tried to use a window function in my query (using Spark SQL), it gave me the error 'you need to build spark with hive functionality'. After searching, I found suggestions to use Spark version 1.4.0, which I tried with no luck. Some posts also suggested building Spark with Hive functionality, but I could not find the right way to do it.
When using Spark 1.4.0, I got the following error:
raise ValueError("invalid mode %r (only r, w, b allowed)")
ValueError: invalid mode %r (only r, w, b allowed)
16/04/04 14:17:17 WARN PythonRDD: Incomplete task interrupted: Attempting to kill Python Worker
16/04/04 14:17:17 INFO HadoopRDD: Input split: file:/C:/Users/testesktop/spark-1.4.0-bin-hadoop2.4/test:910178+910178
16/04/04 14:17:17 INFO Executor: Executor killed task 1.0 in stage 1.0 (TID 2)
16/04/04 14:17:17 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, localhost): TaskKilled (killed intentionally)
16/04/04 14:17:17 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
I think this is the third time that I have answered a similar question:
- Using window functions in Spark
- Window functions not working in PySpark SQLContext

Window functions are supported with HiveContext, not with the regular SQLContext.
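For context, a window function computes a value over a group of rows related to the current row without collapsing those rows into one. As a plain-Python sketch (not Spark's implementation), this is what `ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)` would produce; the `dept`/`salary`/`name` columns are made-up example data:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"dept": "sales", "name": "ann", "salary": 90},
    {"dept": "sales", "name": "bob", "salary": 70},
    {"dept": "hr",    "name": "cat", "salary": 80},
]

def row_number(rows, partition_key, order_key, reverse=False):
    """Mimic ROW_NUMBER() OVER (PARTITION BY partition_key ORDER BY order_key)."""
    out = []
    # Group rows into partitions by the partition key.
    keyed = sorted(rows, key=itemgetter(partition_key))
    for _, group in groupby(keyed, key=itemgetter(partition_key)):
        # Number the rows 1, 2, ... within each partition, in sort order.
        ordered = sorted(group, key=itemgetter(order_key), reverse=reverse)
        for i, row in enumerate(ordered, start=1):
            out.append(dict(row, row_number=i))
    return out

for r in row_number(rows, "dept", "salary", reverse=True):
    print(r["dept"], r["name"], r["row_number"])
```

In Spark 1.4+ the equivalent is `pyspark.sql.functions.row_number().over(Window.partitionBy(...).orderBy(...))`, and that call is exactly what requires a HiveContext rather than a plain SQLContext.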
Concerning how to build Spark with Hive support, the answer is in the official Building Spark documentation:

Building with Hive and JDBC Support
To enable Hive integration for Spark SQL along with its JDBC server and CLI, add the -Phive and -Phive-thriftserver profiles to your existing build options. By default Spark will build with Hive 0.13.1 bindings.
Apache Hadoop 2.4.X with Hive 13 support (example):
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package
Building for Scala 2.11
To produce a Spark package compiled with Scala 2.11, use the -Dscala-2.11 property:
./dev/change-scala-version.sh 2.11
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package
There is no magic here; everything is in the documentation.