Spark DataFrame and renaming multiple columns (Java)
Question
Is there a nicer way to prefix or rename all or several columns of a given SparkSQL DataFrame at once than calling dataFrame.withColumnRenamed() multiple times?
An example: when I want to detect changes using a full outer join, I am left with two DataFrames that have the same structure.
Recommended answer
I suggest using the select() method for this. In fact, withColumnRenamed() itself uses select() internally. Here is an example of how to rename multiple columns:
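import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

val someDataframe: DataFrame = ...
val initialColumnNames = Seq("a", "b", "c")
// Alias every column with a "renamed_" prefix
val renamedColumns = initialColumnNames.map(name => col(name).as(s"renamed_$name"))
// select() returns a new DataFrame; someDataframe itself is unchanged
val renamedDataframe = someDataframe.select(renamedColumns: _*)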
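The snippet in the answer is Scala, while the question asks about Java. As a rough sketch of the same idea from the Java side, the alias expressions can be built as plain strings and handed to a single select-style call. The class and method names below (RenameExpressions, quote, withPrefix) are hypothetical and not part of Spark; the code only assumes that Spark SQL accepts backtick-quoted identifiers with embedded backticks doubled.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Spark): builds "old AS new" expressions
// so that many columns can be renamed in a single select-style call.
final class RenameExpressions {

    // Quote an identifier with backticks, doubling any embedded backtick.
    static String quote(String name) {
        return "`" + name.replace("`", "``") + "`";
    }

    // Produce one alias expression per column, adding the prefix to each name.
    static String[] withPrefix(String[] columnNames, String prefix) {
        return Arrays.stream(columnNames)
                .map(name -> quote(name) + " AS " + quote(prefix + name))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] exprs = withPrefix(new String[] {"a", "b", "c"}, "renamed_");
        // Print the generated expressions, one per line.
        for (String expr : exprs) {
            System.out.println(expr);
        }
    }
}
```

With Spark on the classpath, the resulting array would typically be passed to someDataframe.selectExpr(exprs), and someDataframe.columns() could supply the input names when every column should get the prefix, for example one prefix per side before a full outer join.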