Row aggregations in Scala
Question
I am looking for a way to get a new column in a data frame in Scala that calculates the min/max of the values in col1, col2, ..., col10 for each row.
I know I can do it with a UDF, but maybe there is an easier way.
Thanks!
Answer
import org.apache.spark.sql.functions._
// toDF requires the implicits of an active SparkSession in scope:
import spark.implicits._

val df = Seq(
  (1, 3, 0, 9, "a", "b", "c")
).toDF("col1", "col2", "col3", "col4", "col5", "col6", "Col7")

val cols = Seq("col1", "col2", "col3", "col4")

// greatest/least take a varargs of columns and evaluate per row
val rowMax = greatest(cols map col: _*).alias("max")
val rowMin = least(cols map col: _*).alias("min")

df.select($"*", rowMin, rowMax).show
// +----+----+----+----+----+----+----+---+---+
// |col1|col2|col3|col4|col5|col6|Col7|min|max|
// +----+----+----+----+----+----+----+---+---+
// |   1|   3|   0|   9|   a|   b|   c|  0|  9|
// +----+----+----+----+----+----+----+---+---+
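To see what greatest and least compute per row, here is a minimal plain-Scala sketch (no Spark required) of the same row-wise reduction, using hypothetical sample rows:

```scala
// Each inner Seq stands in for the numeric columns of one row.
val rows = Seq(
  Seq(1, 3, 0, 9),
  Seq(5, 2, 8, 4)
)

// Per-row minimum, analogous to least(col1, ..., col4)
val mins = rows.map(_.min)
// Per-row maximum, analogous to greatest(col1, ..., col4)
val maxs = rows.map(_.max)

println(mins) // List(0, 2)
println(maxs) // List(9, 8)
```

Unlike this sketch, the Spark functions are evaluated inside the query engine, so no UDF (and no serialization of row data to JVM objects) is needed.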