How to format a column in Spark DataFrame

Views: 343  Published: 2020/9/4 20:04:50  Tags: scala apache-spark dataframe apache-spark-sql


Problem description

I have a Spark DataFrame called df that looks like this:

+---+---+
| c1| c2|
+---+---+
|  1|  6|
|  2|  7|
|  3|  8|
|  4|  9|
|  5| 10|
|  6| 11|
|  7| 12|
|  8| 13|
|  9| 14|
+---+---+
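
For anyone who wants to reproduce the question locally, the table above can be rebuilt with a few lines of Scala. This is only a sketch: the object name `SampleData`, the helper `build`, and the local-mode session are assumptions for illustration and do not come from the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SampleData {
  // Hypothetical helper: builds the nine-row table shown above.
  // In every row c2 equals c1 + 5.
  def build(spark: SparkSession): DataFrame = {
    import spark.implicits._
    (1 to 9).map(i => (i, i + 5)).toDF("c1", "c2")
  }

  def main(args: Array[String]): Unit = {
    // A local session is an assumption made for experimentation only.
    val spark = SparkSession.builder()
      .appName("sample-data")
      .master("local[*]")
      .getOrCreate()

    val df = build(spark)
    df.show()

    spark.stop()
  }
}
```

In `spark-shell` a session named `spark` already exists, so the body of `build` alone is enough there.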

I want to generate a new DataFrame that adds the fraction c1/c2 as a column c3, rounded to four decimal places. The result should look like this:

+---+---+------+
| c1| c2|    c3|
+---+---+------+
|  9| 14|0.6429|
|  8| 13|0.6154|
|  7| 12|0.5833|
|  6| 11|0.5455|
|  5| 10|0.5000|
|  4|  9|0.4444|
|  3|  8|0.3750|
|  2|  7|0.2857|
|  1|  6|0.1667|
+---+---+------+

However, when I use the code

df.withColumn("c3", col("c1")/col("c2")).orderBy(col("c3").desc).show()

I get:

+---+---+-------------------+
| c1| c2|                 c3|
+---+---+-------------------+
|  9| 14| 0.6428571428571429|
|  8| 13| 0.6153846153846154|
|  7| 12| 0.5833333333333334|
|  6| 11| 0.5454545454545454|
|  5| 10|                0.5|
|  4|  9| 0.4444444444444444|
|  3|  8|              0.375|
|  2|  7| 0.2857142857142857|
|  1|  6|0.16666666666666666|
+---+---+-------------------+

How can I format c3 to the desired format without having to generate another DataFrame? (I want to get the result from df in a single line of code. How can I achieve this?)
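
The page does not include an answer, so what follows is only one possible approach, sketched under assumptions. It relies on `format_number` from `org.apache.spark.sql.functions`, which formats a numeric column to a fixed number of decimal places and returns a string column. The object name `FormatFraction`, the helper `withFormattedFraction`, and the local-mode session are hypothetical names chosen for this example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, format_number}

object FormatFraction {
  // Hypothetical helper: one chained expression, no intermediate variable.
  // The sort runs on the numeric ratio; only afterwards is c3 replaced
  // by its four-decimal string form.
  def withFormattedFraction(df: DataFrame): DataFrame =
    df.withColumn("c3", col("c1") / col("c2"))
      .orderBy(col("c3").desc)
      .withColumn("c3", format_number(col("c3"), 4))

  def main(args: Array[String]): Unit = {
    // A local session is an assumption made for experimentation only.
    val spark = SparkSession.builder()
      .appName("format-fraction")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the question: c2 = c1 + 5 for c1 in 1..9.
    val df = (1 to 9).map(i => (i, i + 5)).toDF("c1", "c2")

    // Variant 1: c3 becomes a string such as "0.5000".
    withFormattedFraction(df).show()

    // Variant 2: keep c3 numeric by casting to a fixed-scale decimal.
    df.withColumn("c3", (col("c1") / col("c2")).cast("decimal(5,4)"))
      .orderBy(col("c3").desc)
      .show()

    spark.stop()
  }
}
```

The two variants trade off differently. `format_number` yields a `StringType` column and inserts thousands separators for large values, so it suits display rather than further arithmetic. The `decimal(5,4)` cast keeps the column numeric, but it only has room for one digit before the decimal point, which is enough here because every ratio is below 1.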
