How to reformat the Spark Python Output

Views: 212 Published: 2016/5/22 15:39:18 Tags: python apache-spark

Question

(u'142578', (u'The-North-side-9890', (u'   12457896', 45.0)))
(u'124578', (u'The-West-side-9091', (u'   14578217', 0.0)))

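This is the output I get from joining two RDDs on their IDs with the Spark join, which yields records of the form (key, (value_left, value_right)).

I would like output like this instead: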

The-North-side-9890,12457896,45.0
The-West-side-9091,14578217,0.0

For this I tried the following code:

from pyspark import SparkContext
sc = SparkContext("local", "info")

file1 = sc.textFile('/home/hduser/join/part-00000').map(lambda line: line.split(','))
result = file1.map(lambda x: (x[1]+', '+x[2],float(x[3][:-3]))).reduceByKey(lambda a,b:a+b)
result = result.map(lambda x:x[0]+','+str(x[1]))
result = result.map(lambda x: x.lstrip('[(').rstrip(')]')).coalesce(1).saveAsTextFile("hdfs://localhost:9000/finalop")

But it gives me the following output:

(u'The-North-side-9896',  (u'   12457896',0.0
(u'The-East-side-9876',  (u'  47125479',0.0

I want to clean this up. How can I do that? Please help me achieve this.

Recommended answer

To get from this:

(u'142578', (u'The-North-side-9890', (u'   12457896', 45.0)))

to this:

The-North-side-9890,12457896,45.0

you need to use:

result = result.map(lambda (k, (s, (n1, n2))): ','.join([s, str(int(n1)), str(float(n2))]))
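Note that the lambda above unpacks a tuple in its parameter list, which only works on Python 2. On Python 3 a named function with explicit unpacking should do the same job. The following is a minimal sketch rather than a definitive implementation: format_record and the sample list are hypothetical names introduced here for illustration, and the records are assumed to have the shape (key, (name, (number, amount))) shown in the question.

```python
def format_record(record):
    """Flatten (key, (name, (number, amount))) into 'name,number,amount'."""
    # Explicit unpacking replaces the Python 2-only tuple parameter.
    _key, (name, (number, amount)) = record
    # int() tolerates the leading spaces in values such as '   12457896'.
    return ','.join([name, str(int(number)), str(float(amount))])


# Hypothetical sample mirroring the joined records from the question.
sample = [
    ('142578', ('The-North-side-9890', ('   12457896', 45.0))),
    ('124578', ('The-West-side-9091', ('   14578217', 0.0))),
]

lines = [format_record(r) for r in sample]
for line in lines:
    print(line)
```

With a real RDD of joined records, the same function could be passed to map, for example result.map(format_record), before calling saveAsTextFile.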
