Unpacking a list to select multiple columns from a Spark data frame
Question
I have a Spark data frame df. Is there a way of sub-selecting a few columns using a list of these columns?
scala> df.columns
res0: Array[String] = Array("a", "b", "c", "d")
I know I can do something like df.select("b", "c"). But suppose I have a list containing a few column names, val cols = List("b", "c"); is there a way to pass this to df.select? df.select(cols) throws an error. I am looking for something like df.select(*cols) in Python.
Recommended answer
Use df.select(cols.head, cols.tail: _*)

Let me know if it works :)
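Applied to the question's data frame, the idiom looks like this. This is a sketch only: it assumes a running SparkSession bound to spark (for example, inside spark-shell), and the toy data frame built with selectExpr is invented here to match the question's column names:

```scala
// Sketch: assumes `spark` is an existing SparkSession (e.g. in spark-shell).
import org.apache.spark.sql.functions.col

// Toy data frame with columns a, b, c, d, matching the question.
val df = spark.range(1).selectExpr("id as a", "id as b", "id as c", "id as d")
val cols = List("b", "c")

// The head/tail split fits select(col: String, cols: String*):
df.select(cols.head, cols.tail: _*).show()

// Equivalent, using the Column-based overload select(cols: Column*):
df.select(cols.map(col): _*).show()
```

The second form avoids the head/tail split entirely by mapping each name to a Column and expanding the resulting list with : _*.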
The key is the method signature of select:
select(col: String, cols: String*)
The cols: String* entry takes a variable number of arguments. : _* unpacks the arguments so that they can be handled by this parameter. This is very similar to unpacking in Python with *args. See here and here for other examples.
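The same mechanism can be shown without Spark at all. The sketch below defines a plain-Scala method with the same (String, String*) shape; VarargsDemo and its select are stand-ins invented for illustration, not Spark APIs:

```scala
// A plain-Scala stand-in for Spark's select(col: String, cols: String*)
// signature, to show how the varargs expansion works.
object VarargsDemo {
  // Varargs parameter: inside the method, `cols` is a Seq[String].
  def select(col: String, cols: String*): Seq[String] = col +: cols

  def main(args: Array[String]): Unit = {
    val cols = List("b", "c")

    // select(cols) does not compile: a List[String] is not a String.
    // Splitting off the head and expanding the tail with ": _*" makes
    // the list fit the (String, String*) parameter list:
    val picked = select(cols.head, cols.tail: _*)
    println(picked.mkString(", "))  // b, c
  }
}
```

The : _* ascription tells the compiler to pass the sequence's elements as the repeated parameter, rather than as a single argument.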