Spark assign value if null to column (python)
Question
Suppose I have the following data:
+--------------------+-----+--------------------+
| values|count| values2|
+--------------------+-----+--------------------+
| aaaaaa| 249| null|
| bbbbbb| 166| b2|
| cccccc| 1680| something|
+--------------------+-----+--------------------+
If there is a null value in the values2 column, how can I assign the value of the values column to it? The result should be:
+--------------------+-----+--------------------+
| values|count| values2|
+--------------------+-----+--------------------+
| aaaaaa| 249| aaaaaa|
| bbbbbb| 166| b2|
| cccccc| 1680| something|
+--------------------+-----+--------------------+
I thought of something like the following, but it doesn't work:
df.na.fill({"values2":df['values']}).show()
I found this way to solve it, but there should be a more straightforward approach:
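from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

# Return b unless it is missing; note that `if b` also treats "" as missing
def change_null_values(a, b):
    if b:
        return b
    else:
        return a

udf_change_null = udf(change_null_values, StringType())
df.withColumn("values2", udf_change_null("values", "values2")).show()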
Recommended answer
You can use …