Removing Characters from Python Output

Views: 82 Published: 2021/4/8 20:25:23 Tags: python apache-spark

Question

I have spent a lot of time trying to remove characters such as u, u', u" and [()/'" from my Spark Python output, because they cause problems in further processing. Please focus on that issue.

I have input like this:

(u"(u'[25145,   12345678'", 0.0)
(u"(u'[25146,   25487963'", 43.0)

When I apply my code to sum the values, I get output like this:

(u'(u"(u\'[54879,    5125478\'"', 0.0)
(u"(u'[25145,   25145879'", 11.0)
(u'(u"(u\'[56897,    22548793\'"', 0.0)

So I want to remove all the characters like (u'(u"(u\'["'').

I want output like this:

54879,5125478,0.0

25145,25145879,11.0

The code I tried is:

from pyspark import SparkContext
import os
import sys

sc = SparkContext("local", "aggregate")

file1 = sc.textFile("hdfs://localhost:9000/data/first/part-00000")
file2 = sc.textFile("hdfs://localhost:9000/data/second/part-00000")

file3 = file1.union(file2).coalesce(1).map(lambda line: line.split(','))

result = file3.map(lambda x: ((x[0]+', '+x[1],float(x[2][:-1])))).reduceByKey(lambda a,b:a+b).coalesce(1)

result.saveAsTextFile("hdfs://localhost:9000/Test1")

Recommended Answer

I think your only problem is that you have to reformat your result before saving it to the file, i.e. something like:

result.map(lambda x:x[0]+','+str(x[1])).saveAsTextFile("hdfs://localhost:9000/Test1")
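Some background on why this helps: saveAsTextFile writes the string representation of each element, and in Python 2 the string form of a tuple holding a unicode key includes the u'...' prefix and the parentheses. Since the input shown in the question already carries that residue (it looks like the saved output of an earlier run), the key may also need cleaning before the reduce step. The following is a minimal sketch of that idea in plain Python, not a definitive fix. The helper names parse_line and format_record are hypothetical, and it assumes every line contains exactly three plain numbers (two IDs and a value, with no exponent notation).

```python
import re

# Matches an integer or a simple decimal number, with an optional minus sign.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def parse_line(line):
    """Turn a residue-laden line into a clean (key, value) pair.

    Hypothetical helper: assumes the line holds exactly three numbers,
    e.g. (u"(u'[25145,   12345678'", 0.0) -> ("25145,12345678", 0.0).
    """
    # Raises ValueError if the line does not contain exactly three numbers.
    first_id, second_id, value = NUMBER.findall(line)
    return (first_id + "," + second_id, float(value))


def format_record(record):
    """Render a (key, value) pair as one comma-separated text line."""
    key, value = record
    return key + "," + str(value)


if __name__ == "__main__":
    sample = [
        "(u\"(u'[25145,   12345678'\", 0.0)",
        "(u\"(u'[25146,   25487963'\", 43.0)",
    ]
    for line in sample:
        print(format_record(parse_line(line)))
```

With Spark, these helpers could be plugged into the pipeline from the question roughly as file1.union(file2).map(parse_line).reduceByKey(lambda a, b: a + b).map(format_record).saveAsTextFile(...), so that both the keys and the saved lines stay free of the tuple formatting.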
