在spark数据帧中减去两列为空的列 [英] subtract two columns with null in spark dataframe

查看:74
本文介绍了在spark数据帧中减去两列为空的列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是火花新手,我拥有数据框df:

I new to spark, I have dataframe df:

+----------+------------+-----------+
| Column1  | Column2    | Sub       |                          
+----------+------------+-----------+
| 1        | 2          | 1         |                                         
+----------+------------+-----------+
| 4        | null       | null      |                          
+----------+------------+-----------+
| 5        | null       | null      |                          
+----------+------------+-----------+
| 6        | 8          | 2         |                          
+----------+------------+-----------+

当减去两列时,一列为空,因此结果列也为空.

when subtracting two columns, one column has null so resulting column also resulting as null.

df.withColumn("Sub", col(A)-col(B))

预期输出应为:

+----------+------------+-----------+
|  Column1 | Column2    | Sub       |                          
+----------+------------+-----------+
| 1        | 2          | 1         |                                           
+----------+------------+-----------+
| 4        | null       | 4         |                          
+----------+------------+-----------+
| 5        | null       | 5         |                          
+----------+------------+-----------+
| 6        | 8          | 2         |                          
+----------+------------+-----------+

我不想将column2替换为0,它只能为null. 有人可以帮我吗?

I don't want to replace the column2 to replace with 0, it should be null only. Can someone help me on this?

推荐答案

您可以将when函数用作

import org.apache.spark.sql.functions._
df.withColumn("Sub", when(col("Column1").isNull, lit(0)).otherwise(col("Column1")) - when(col("Column2").isNull, lit(0)).otherwise(col("Column2")))

您的最终结果应为

+-------+-------+----+
|Column1|Column2| Sub|
+-------+-------+----+
|      1|      2|-1.0|
|      4|   null| 4.0|
|      5|   null| 5.0|
|      6|      8|-2.0|
+-------+-------+----+

这篇关于在spark数据帧中减去两列为空的列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆