pyspark flatMap error: TypeError: 'int' object is not iterable


Problem Description

This is the sample code from my book:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("spark://chetan-ThinkPad-E470:7077").setAppName("FlatMap")
sc = SparkContext(conf=conf)

numbersRDD = sc.parallelize([1, 2, 3, 4])
actionRDD = numbersRDD.flatMap(lambda x: x + x).collect()
for values in actionRDD:
    print(values)

I am getting this error: TypeError: 'int' object is not iterable

    at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRDD.scala:193)
    at org.apache.spark.api.python.PythonRunner$$anon$1.<init>(PythonRDD.scala:234)
    at org.apache.spark.api.python.PythonRunner.compute(PythonRDD.scala:152)
    at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:63)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    ... 1 more

Answer

You cannot use flatMap on an Int object.

flatMap can be used on collection objects such as arrays or lists.
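
For illustration, here is a minimal sketch reusing the sc from the question (the values and the x * 10 mapping are only examples): the function passed to flatMap must return an iterable, and every element of that iterable becomes an element of the resulting RDD.

numbersRDD = sc.parallelize([1, 2, 3, 4])
# the lambda returns a list (an iterable), so flatMap can flatten it;
# x * 10 is just an illustrative second value, not from the original question
pairsRDD = numbersRDD.flatMap(lambda x: [x, x * 10])
print(pairsRDD.collect())  # [1, 10, 2, 20, 3, 30, 4, 40]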

Since you have an RDD[Integer], you can use map instead:

numbersRDD = sc.parallelize([1, 2, 3, 4])
actionRDD = numbersRDD.map(lambda x: x + x)

def printing(x):
    print(x)

actionRDD.foreach(printing)

This should print:

2
4
6
8
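
If flatMap itself is what you want, a minimal sketch (reusing the sc and numbersRDD from the snippet above) is to have the lambda return a one-element list, which yields the same values:

# assumes the sc and numbersRDD defined in the snippet above
doubledRDD = numbersRDD.flatMap(lambda x: [x + x])  # wrapping in a list makes the result iterable
print(doubledRDD.collect())  # [2, 4, 6, 8]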
