Why is the number of partitions after groupBy 200? Why is this 200 not some other number?


Question

It's Spark 2.2.0-SNAPSHOT.

Why is the number of partitions 200 after the groupBy transformation in the following example?

scala> spark.range(5).groupByKey(_ % 5).count.rdd.getNumPartitions
res0: Int = 200

What's so special about 200? Why not some other number like 1024?

I've seen Why does groupByKey operation have always 200 tasks?, which asks specifically about groupByKey, but this question is about the "mystery" behind picking 200 as the default, not about why there are 200 partitions by default.

Answer

This is set by spark.sql.shuffle.partitions.
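
You can check the current value from the session's runtime configuration; a minimal sketch run in the Spark shell (the res numbering and the reported value assume an out-of-the-box session):

scala> spark.conf.get("spark.sql.shuffle.partitions")
res1: String = 200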

In general, whenever you do a Spark SQL aggregation or a join that shuffles data, this is the number of resulting partitions.
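
As an illustration, lowering spark.sql.shuffle.partitions before running the same aggregation changes the resulting partition count accordingly; a minimal sketch assuming a fresh Spark shell (the res numbers are illustrative):

scala> spark.conf.set("spark.sql.shuffle.partitions", "8")

scala> spark.range(5).groupByKey(_ % 5).count.rdd.getNumPartitions
res2: Int = 8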

It is constant for your entire action (i.e. it is not possible to change it for one transformation and then again for another).

See http://spark.apache.org/docs/latest/sql-programming-guide.html#other-configuration-options for more info.
