How to calculate sum and count in a single groupBy?

Published: 2021/11/14 21:50:56 · Tags: scala, apache-spark, apache-spark-sql


Question

Given the following DataFrame:

val client = Seq((1,"A",10),(2,"A",5),(3,"B",56)).toDF("ID","Categ","Amnt")
+---+-----+----+
| ID|Categ|Amnt|
+---+-----+----+
|  1|    A|  10|
|  2|    A|   5|
|  3|    B|  56|
+---+-----+----+

I would like to obtain the number of IDs and the total amount per category:

+-----+-----+---------+
|Categ|count|sum(Amnt)|
+-----+-----+---------+
|    B|    1|       56|
|    A|    2|       15|
+-----+-----+---------+

Is it possible to compute the count and the sum without having to do a join like this one?

client.groupBy("Categ").count
      .join(client.withColumnRenamed("Categ","cat")
           .groupBy("cat")
           .sum("Amnt"), 'Categ === 'cat)
      .drop("cat")

Maybe something like this:

client.createOrReplaceTempView("client")
spark.sql("SELECT Categ, count(Categ), sum(Amnt) FROM client GROUP BY Categ").show()
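
For the SQL route, column aliases make the result easier to consume downstream. The following is a minimal sketch, assuming a local SparkSession; the app name and the aliases `cnt` and `total` are illustrative names, not part of the original question.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session is acceptable for this illustration.
val spark = SparkSession.builder.master("local[*]").appName("SqlAgg").getOrCreate()
import spark.implicits._

val client = Seq((1, "A", 10), (2, "A", 5), (3, "B", 56)).toDF("ID", "Categ", "Amnt")
client.createOrReplaceTempView("client")

// Count and sum are computed in one GROUP BY, so no join is needed.
// `cnt` and `total` are hypothetical alias names chosen for readability.
val bySql = spark.sql(
  "SELECT Categ, count(ID) AS cnt, sum(Amnt) AS total FROM client GROUP BY Categ"
)
bySql.show()
```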

Recommended answer

I'm giving a different example than yours.

Several aggregate functions can be applied in a single groupBy like this; adapt it to your case:

// In 1.3.x, in order for the grouping column "department" to show up,
// it must be included explicitly as part of the agg function call.
df.groupBy("department").agg($"department", max("age"), sum("expense"))

// In 1.4+, grouping column "department" is included automatically.
df.groupBy("department").agg(max("age"), sum("expense"))


import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions._

val spark: SparkSession = SparkSession
  .builder.master("local")
  .appName("MyGroup")
  .getOrCreate()
import spark.implicits._

val client: DataFrame = spark.sparkContext.parallelize(
  Seq((1,"A",10),(2,"A",5),(3,"B",56))
).toDF("ID","Categ","Amnt")

client.groupBy("Categ").agg(sum("Amnt"),count("ID")).show()


+-----+---------+---------+
|Categ|sum(Amnt)|count(ID)|
+-----+---------+---------+
|    B|       56|        1|
|    A|       15|        2|
+-----+---------+---------+
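
The output above has the columns in a different order and with a different count header than the layout requested in the question. As a sketch of one way to match that layout, the count can be listed first and renamed with `as`; the app name and the variable names `result` and `byPairs` are illustrative. The second expression shows the string-pair overload of `agg`, which yields the same values under the default column names.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, sum}

// Assumption: a local session is acceptable for this illustration.
val spark = SparkSession.builder.master("local[*]").appName("AggAlias").getOrCreate()
import spark.implicits._

val client = Seq((1, "A", 10), (2, "A", 5), (3, "B", 56)).toDF("ID", "Categ", "Amnt")

// Count first, renamed to "count"; sum keeps its default name "sum(Amnt)".
val result = client
  .groupBy("Categ")
  .agg(count("ID").as("count"), sum("Amnt"))
result.show()

// Equivalent aggregation using (column -> function name) pairs.
val byPairs = client.groupBy("Categ").agg("Amnt" -> "sum", "ID" -> "count")
byPairs.show()
```

Row order after a groupBy is not guaranteed, so add an explicit `orderBy("Categ")` if a stable order matters.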
