Spark support for using window functions
Problem Description
I am using Spark version 1.6.0 with Python. I found that window functions are not supported by the version of Spark I am using: when I tried to use a window function in my query (using Spark SQL), it gave me the error 'you need to build spark with hive functionality'. After searching, I found suggestions to use Spark version 1.4.0, which I tried with no luck. Some posts also suggested building Spark with Hive functionality, but I could not find the right way to do it.
When using Spark 1.4.0, I got the following error:
raise ValueError("invalid mode %r (only r, w, b allowed)")
ValueError: invalid mode %r (only r, w, b allowed)
16/04/04 14:17:17 WARN PythonRDD: Incomplete task interrupted: Attempting to kill Python Worker
16/04/04 14:17:17 INFO HadoopRDD: Input split: file:/C:/Users/testesktop/spark-1.4.0-bin-hadoop2.4/test:910178+910178
16/04/04 14:17:17 INFO Executor: Executor killed task 1.0 in stage 1.0 (TID 2)
16/04/04 14:17:17 WARN TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, localhost): TaskKilled (killed intentionally)
16/04/04 14:17:17 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
I think this is the third time that I have answered a similar question:
- Using window functions in Spark
- Window functions not working in PySpark SQLContext

Window functions are supported with HiveContext, not with the regular SQLContext.
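For context, a window function computes a value over a group of rows related to the current row without collapsing those rows into one. As a plain-Python sketch (not Spark's implementation), this is what `ROW_NUMBER() OVER (PARTITION BY dept ORDER BY salary DESC)` would produce; the `dept`/`salary`/`name` columns are made-up example data:

```python
from itertools import groupby
from operator import itemgetter

rows = [
    {"dept": "sales", "name": "ann", "salary": 90},
    {"dept": "sales", "name": "bob", "salary": 70},
    {"dept": "hr",    "name": "cat", "salary": 80},
]

def row_number(rows, partition_key, order_key, reverse=False):
    """Mimic ROW_NUMBER() OVER (PARTITION BY partition_key ORDER BY order_key)."""
    out = []
    # Group rows into partitions by the partition key.
    keyed = sorted(rows, key=itemgetter(partition_key))
    for _, group in groupby(keyed, key=itemgetter(partition_key)):
        # Number the rows 1, 2, ... within each partition, in sort order.
        ordered = sorted(group, key=itemgetter(order_key), reverse=reverse)
        for i, row in enumerate(ordered, start=1):
            out.append(dict(row, row_number=i))
    return out

for r in row_number(rows, "dept", "salary", reverse=True):
    print(r["dept"], r["name"], r["row_number"])
```

In Spark 1.4+ the equivalent is `pyspark.sql.functions.row_number().over(Window.partitionBy(...).orderBy(...))`, and that call is exactly what requires a HiveContext rather than a plain SQLContext.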
Concerning how to build Spark with Hive support, the answer is in the official Building Spark documentation:

Building with Hive and JDBC Support
To enable Hive integration for Spark SQL along with its JDBC server and CLI, add the -Phive and -Phive-thriftserver profiles to your existing build options. By default Spark will build with Hive 0.13.1 bindings.
Apache Hadoop 2.4.X with Hive 13 support (example):
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver -DskipTests clean package
Building for Scala 2.11
To produce a Spark package compiled with Scala 2.11, use the -Dscala-2.11 property:
./dev/change-scala-version.sh 2.11
mvn -Pyarn -Phadoop-2.4 -Dscala-2.11 -DskipTests clean package
There is no magic here; everything is in the documentation.