Number of Partitions of a Spark DataFrame

Published: 2021/11/14 22:18:40. Tags: apache-spark, dataframe, apache-spark-sql


Can anyone explain how the number of partitions is determined when a Spark DataFrame is created?

I know that for an RDD, we can specify the number of partitions when creating it, as shown below.

val RDD1 = sc.textFile("path", 6)
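For reference, here is a minimal sketch of that RDD call with the resulting partition count inspected. The local session, the temporary file standing in for "path", and the variable names are assumptions made for illustration only. Note that the second argument of `textFile` is `minPartitions`, a suggested minimum rather than an exact count, so the printed value may differ from 6.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Hypothetical local session, used only for this illustration.
val spark = SparkSession.builder()
  .appName("rdd-min-partitions-sketch")
  .master("local[2]")
  .getOrCreate()
val sc = spark.sparkContext

// Hypothetical input: a small temporary text file standing in for "path".
val tmp = Files.createTempFile("partitions-sketch", ".txt")
Files.write(tmp, (1 to 1000).map(_.toString).mkString("\n").getBytes("UTF-8"))

// The second argument is minPartitions: a suggested minimum, not an exact count.
val rdd1 = sc.textFile(tmp.toString, 6)
val numPartitions = rdd1.getNumPartitions
println(s"partitions: $numPartitions")

spark.stop()
```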

But when creating a Spark DataFrame, there does not seem to be an option to specify the number of partitions as there is for an RDD.

The only possibility I can think of is to use the repartition API after creating the DataFrame.

df.repartition(4)
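A minimal sketch of this repartition approach, assuming a local SparkSession and a toy DataFrame built with `spark.range` (both are illustrative and not part of the original question). It shows that `repartition` returns a new DataFrame instead of changing `df` in place, and that the partition count can be read through `df.rdd.getNumPartitions`.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical local session, used only for this illustration.
val spark = SparkSession.builder()
  .appName("dataframe-repartition-sketch")
  .master("local[2]")
  .getOrCreate()

// Toy DataFrame; its initial partition count depends on the source and configuration.
val df = spark.range(0, 1000).toDF("id")
val before = df.rdd.getNumPartitions

// repartition returns a new DataFrame; df itself is left unchanged.
val df4 = df.repartition(4)
val after = df4.rdd.getNumPartitions

println(s"before: $before, after repartition(4): $after")

spark.stop()
```

Because the result has to be captured in a new value, calling `df.repartition(4)` on its own line without assigning it has no lasting effect on `df`.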

So can anyone please tell me whether it is possible to specify the number of partitions while creating a DataFrame?