pyspark: how to return the average of a column based on the value of another column?

Views: 290 | Posted: 2020/9/4 5:16:51 | Tags: python, dataframe, apache-spark, pyspark

Problem description

I wouldn't expect this to be difficult, but I'm having trouble understanding how to take the average of a column in my spark dataframe.

The dataframe looks like this:

+-------+------------+--------+------------------+
|Private|Applications|Accepted|              Rate|
+-------+------------+--------+------------------+
|    Yes|         417|     349|0.8369304556354916|
|    Yes|        1899|    1720|0.9057398630858347|
|    Yes|        1732|    1425|0.8227482678983834|
|    Yes|         494|     313|0.6336032388663968|
|     No|        3540|    2001|0.5652542372881356|
|     No|        7313|    4664|0.6377683577191303|
|    Yes|         619|     516|0.8336025848142165|
|    Yes|         662|     513|0.7749244712990937|
|    Yes|         761|     725|0.9526938239159002|
|    Yes|        1690|    1366| 0.808284023668639|
|    Yes|        6075|    5349|0.8804938271604938|
|    Yes|         632|     494|0.7816455696202531|
|     No|        1208|     877|0.7259933774834437|
|    Yes|       20192|   13007|0.6441660063391442|
|    Yes|        1436|    1228|0.8551532033426184|
|    Yes|         392|     351|0.8954081632653061|
|    Yes|       12586|    3239|0.2573494358811378|
|    Yes|        1011|     604|0.5974282888229476|
|    Yes|         848|     587|0.6922169811320755|
|    Yes|        8728|    5201|0.5958982584784601|
+-------+------------+--------+------------------+

I want to return the average of the Rate column when Private is equal to "Yes". How can I do this?
