What exactly does spark.sql.shuffle.partitions refer to?
Question
What exactly does spark.sql.shuffle.partitions refer to? Is it the number of partitions that results from a wide transformation, or does it describe something in the middle, such as an intermediate partitioning that happens before the result partitions of the wide transformation are produced?
I ask because, as I understand it, a wide transformation looks like this:
Parent RDDs -> shuffle files -> Child RDDs
What does the spark.sql.shuffle.partitions parameter refer to here: the shuffle files, the child RDDs, or something else that I have overlooked?
Recommended answer
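The Spark SQL configuration documentation describes spark.sql.shuffle.partitions as the default number of partitions to use when shuffling data for joins or aggregations. One way to picture how that single number relates to the Parent RDDs -> shuffle files -> Child RDDs pipeline is the toy model below. It is a minimal sketch in plain Python, assuming a simple hash shuffle; it is not Spark's actual implementation, and the function name `toy_shuffle` and the sample data are hypothetical. In this model the setting shows up twice: as the number of buckets each map-side task writes, and as the number of child partitions that read them.

```python
from collections import defaultdict


def toy_shuffle(parent_partitions, num_shuffle_partitions):
    """Toy hash shuffle with a sum aggregation (illustrative only, not Spark code)."""
    # Map side: every parent partition writes one bucket per target partition.
    shuffle_files = []
    for partition in parent_partitions:
        buckets = [[] for _ in range(num_shuffle_partitions)]
        for key, value in partition:
            buckets[hash(key) % num_shuffle_partitions].append((key, value))
        shuffle_files.append(buckets)

    # Reduce side: child partition i reads bucket i from every map output.
    child_partitions = []
    for i in range(num_shuffle_partitions):
        merged = defaultdict(int)
        for buckets in shuffle_files:
            for key, value in buckets[i]:
                merged[key] += value
        child_partitions.append(dict(merged))
    return shuffle_files, child_partitions


# Three parent partitions, hypothetical (key, value) records with integer keys.
parents = [
    [(1, 10), (2, 20), (3, 30)],
    [(1, 1), (4, 4)],
    [(2, 2), (3, 3), (5, 5)],
]
files, children = toy_shuffle(parents, num_shuffle_partitions=4)

print(len(files))     # one map output per parent partition
print(len(children))  # one child partition per shuffle partition
```

In Spark itself the value can be changed with `spark.conf.set("spark.sql.shuffle.partitions", "4")`. Two caveats apply: the setting affects DataFrame and SQL shuffles rather than raw RDD operations, and with adaptive query execution enabled Spark may coalesce small shuffle partitions, so the partition count observed after a shuffle can be lower than the configured value.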