How to calculate standard deviation and average values of RDD[Long]?

Problem description

I have an RDD[Long] called mod, and I want to compute the standard deviation and mean of this RDD using Spark 2.2 and Scala 2.11.8.

How can I do this?

I tried to calculate the average value as follows, but is there an easier way to get these values?

import org.apache.spark.sql.functions.{avg, stddev}
import spark.implicits._  // needed outside spark-shell for toDF and $"col"

val avg_val = mod.toDF("col").agg(
    avg($"col").as("avg")
).first().getDouble(0)   // first() returns a Row; read the aggregated value by index

val stddev_val = mod.toDF("col").agg(
    stddev($"col").as("stddev")
).first().getDouble(0)

Recommended answer

I have an RDD[Long] called mod and I want to compute standard deviation and mean

Just use stats:

scala> val mod = sc.parallelize(Seq(1L, 3L, 5L))
mod: org.apache.spark.rdd.RDD[Long] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> val stats = mod.stats
stats: org.apache.spark.util.StatCounter = (count: 3, mean: 3.000000, stdev: 1.632993, max: 5.000000, min: 1.000000)

scala> stats.mean
res0: Double = 3.0

scala> stats.stdev
res1: Double = 1.632993161855452
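
As a side note (not part of the original answer), the StatCounter returned by stats also carries the other statistics computed in that same pass, for example:

stats.count          // 3
stats.sum            // 9.0
stats.variance       // population variance: 8.0 / 3
stats.sampleStdev    // sample (Bessel-corrected) standard deviation: 2.0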

stats uses the same internals as stdev and mean, but it has to scan the data only once.
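
To illustrate the difference, here is a minimal sketch (not from the original answer; the val names are only for illustration) of the per-statistic RDD methods, each of which runs its own pass, next to the single-pass stats call:

// Each of these actions scans mod separately:
val m = mod.mean()   // 3.0
val s = mod.stdev()  // 1.632993161855452 (population standard deviation)

// Single pass: compute everything once, then read the fields.
val st = mod.stats()
val (meanVal, stdevVal) = (st.mean, st.stdev)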

For a Dataset, I would recommend:

import org.apache.spark.sql.functions.{mean, stddev}
import spark.implicits._  // toDS and the (Double, Double) encoder

val (avg_val, stddev_val) = mod.toDS
  .agg(mean("value"), stddev("value"))
  .as[(Double, Double)].first

or

import org.apache.spark.sql.Row

val Row(avg_val: Double, stddev_val: Double) = mod.toDS
  .agg(mean("value"), stddev("value"))
  .first

but it is neither necessary nor useful here.
