repartition() is not affecting RDD partition size

Views: 122 Posted: 2021/4/8 19:34:05 Tags: apache-spark rdd


Problem description

I am trying to change the number of partitions of an RDD using the repartition() method. The method call on the RDD succeeds, but when I explicitly check the partition count using the RDD's partitions.size property, I get back the same number of partitions that it originally had:

scala> rdd.partitions.size
res56: Int = 50

scala> rdd.repartition(10)
res57: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[19] at repartition at <console>:27

At this stage I perform an action such as rdd.take(1) just to force evaluation, in case that matters. Then I check the number of partitions again:

scala> rdd.partitions.size
res58: Int = 50

As you can see, it has not changed. Can someone explain why?
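
One likely explanation, offered as a sketch and not as the original accepted answer: RDDs are immutable, so repartition() leaves the RDD it is called on untouched and returns a new RDD (the res57 value in the session above). Checking rdd.partitions.size afterwards therefore still inspects the original 50-partition RDD. The snippet below assumes it is run in spark-shell, where sc is predefined; the sample data and the name repartitioned are illustrative and not from the question.

```scala
// Intended for spark-shell, where `sc` (the SparkContext) is already defined.
// The sample data and the name `repartitioned` are illustrative assumptions.
val rdd = sc.parallelize(1 to 1000, 50)   // start with 50 partitions

// repartition() does not modify `rdd`; it returns a new RDD.
val repartitioned = rdd.repartition(10)

println(rdd.partitions.size)            // still 50: the original is unchanged
println(repartitioned.partitions.size)  // 10: the new RDD has the requested count
```

In the session from the question, the same check could be made on the returned value directly, for example res57.partitions.size, or by assigning the result of rdd.repartition(10) to a new val and using that from then on.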
