How to format a column in Spark DataFrame

Views: 37  Published: 2021/11/14 22:54:38  Tags: scala apache-spark dataframe apache-spark-sql

Problem description

I have a Spark DataFrame called df that looks like this:

+---+---+
| c1| c2|
+---+---+
|  1|  6|
|  2|  7|
|  3|  8|
|  4|  9|
|  5| 10|
|  6| 11|
|  7| 12|
|  8| 13|
|  9| 14|
+---+---+

I want to generate a new DataFrame with the fraction c1/c2 as a new column c3. The result should look like this:

+---+---+------+
| c1| c2|    c3|
+---+---+------+
|  9| 14|0.6429|
|  8| 13|0.6154|
|  7| 12|0.5833|
|  6| 11|0.5455|
|  5| 10|0.5000|
|  4|  9|0.4444|
|  3|  8|0.3750|
|  2|  7|0.2857|
|  1|  6|0.1667|
+---+---+------+

However, when I use this code:

df.withColumn("c3", col("c1")/col("c2")).orderBy(col("c3").desc).show()

I get:

+---+---+-------------------+
| c1| c2|                 c3|
+---+---+-------------------+
|  9| 14| 0.6428571428571429|
|  8| 13| 0.6153846153846154|
|  7| 12| 0.5833333333333334|
|  6| 11| 0.5454545454545454|
|  5| 10|                0.5|
|  4|  9| 0.4444444444444444|
|  3|  8|              0.375|
|  2|  7| 0.2857142857142857|
|  1|  6|0.16666666666666666|
+---+---+-------------------+

How can I format c3 to the desired format without having to generate another DataFrame? (I want to get the result from df in a single line of code. How can I achieve this?)
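
For reference, here is a minimal sketch of two ways the formatting could be done in one chained expression. It is not taken from an answer on this page. The object name `FormatColumnSketch`, the helper names `withRatioDecimal` and `withRatioString`, the local SparkSession settings, and the `decimal(10,4)` precision are all assumptions made for illustration. The first variant casts the ratio to a fixed-scale decimal, so c3 stays numeric. The second uses `format_number`, which returns a string column, so it sorts on the numeric ratio rather than on the formatted text.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, format_number}

object FormatColumnSketch {

  // Hypothetical helper: c3 as a decimal with 4 fractional digits.
  // The column stays numeric, so ordering by c3 is a numeric sort.
  def withRatioDecimal(df: DataFrame): DataFrame =
    df.withColumn("c3", (col("c1") / col("c2")).cast("decimal(10,4)"))
      .orderBy(col("c3").desc)

  // Hypothetical helper: c3 as a formatted string with 4 fractional digits.
  // format_number yields a string, so the sort uses the unformatted ratio.
  def withRatioString(df: DataFrame): DataFrame =
    df.withColumn("c3", format_number(col("c1") / col("c2"), 4))
      .orderBy((col("c1") / col("c2")).desc)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only to make the sketch runnable.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("format-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Rebuild the df from the question: c1 = 1..9, c2 = c1 + 5.
    val df = (1 to 9).map(i => (i, i + 5)).toDF("c1", "c2")

    withRatioDecimal(df).show()
    withRatioString(df).show()

    spark.stop()
  }
}
```

Either helper body can be used directly as the one-liner on df. The decimal cast is probably the safer choice if c3 is used in later arithmetic or sorting, while `format_number` is meant for display and also inserts thousands separators for large values.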
