Spark DataFrame and renaming multiple columns (Java)
Question
Is there a nicer way to prefix or rename all or several columns of a given SparkSQL DataFrame at once than calling dataFrame.withColumnRenamed() multiple times?
An example would be detecting changes with a full outer join: I am then left with two DataFrames that have the same structure.
Recommended answer
I suggest using the select() method for this. In fact, withColumnRenamed() itself uses select() internally. Here is an example (in Scala) of how to rename multiple columns:
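import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

val someDataframe: DataFrame = ...
// Names of the columns to rename
val initialColumnNames = Seq("a", "b", "c")
// Alias each column with a "renamed_" prefix
val renamedColumns = initialColumnNames.map(name => col(name).as(s"renamed_$name"))
// select returns a new DataFrame containing only the aliased columns
val renamedDataframe = someDataframe.select(renamedColumns: _*)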
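Since the question asks about Java, the same select() approach could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Spark 2.x or later (where a DataFrame is a Dataset<Row> in the Java API) and Java 8 or later, and the class name ColumnPrefixer and the method withPrefix are hypothetical names chosen for illustration.

```java
import java.util.Arrays;

import org.apache.spark.sql.Column;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

// Hypothetical helper class; the name is not part of Spark.
public final class ColumnPrefixer {

    private ColumnPrefixer() {
    }

    // Returns a new Dataset in which every column name is prefixed with `prefix`.
    // Assumes Spark 2.x+ (Dataset<Row>) and column names without dots.
    public static Dataset<Row> withPrefix(Dataset<Row> df, String prefix) {
        Column[] renamed = Arrays.stream(df.columns())
                // Alias each existing column with the prefixed name
                .map(name -> df.col(name).as(prefix + name))
                .toArray(Column[]::new);
        // A single select() replaces a chain of withColumnRenamed() calls
        return df.select(renamed);
    }
}
```

For the change-detection case from the question, one option would be to call withPrefix(left, "left_") and withPrefix(right, "right_") before the full outer join, so that the two sides no longer share column names. If every column is being renamed anyway, passing the full list of new names to Dataset.toDF(String...) may be a simpler alternative.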