Convert empty array to null in PySpark
Problem description
I have a PySpark DataFrame.
Example DataFrame:
id | column_1 | column_2 | column_3
--------------------------------------------
1 | ["12"] | ["""] | ["67"]
--------------------------------------------
2 | ["""] | ["78"] | ["90"]
--------------------------------------------
3 | ["""] | ["93"] | ["56"]
--------------------------------------------
4 | ["100"] | ["78"] | ["90"]
--------------------------------------------
I want to convert all the ["""] values in the columns column_1, column_2 and column_3 to null. All three columns are of type Array.
Expected result:
id | column_1 | column_2 | column_3
--------------------------------------------
1 | ["12"] | null | ["67"]
--------------------------------------------
2 | null | ["78"] | ["90"]
--------------------------------------------
3 | null | ["93"] | ["56"]
--------------------------------------------
4 | ["100"] | ["78"] | ["90"]
--------------------------------------------
I tried the solution below:
df = df.withColumn(
"column_1",
F.when((F.size(F.col("column_1")) == ""),
F.lit(None)).otherwise(F.col("column_1"))
).withColumn(
"column_2",
F.when((F.size(F.col("column_2")) == ""),
F.lit(None)).otherwise(F.col("column_2"))
).withColumn(
"column_3",
F.when((F.size(F.col("column_3")) == ""),
F.lit(None)).otherwise(F.col("column_3"))
)
But it converts everything to null. How can I test for an array that contains a single empty string, i.e. [""] rather than []? Thank you.
Recommended answer
You can test with when and replace the values:
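from pyspark.sql import functions as F

# Keep the array unless it equals the placeholder array; when() without
# otherwise() returns null for the rows that fail the condition.
df = df.withColumn(
    "column_1",
    F.when(F.col("column_1") != F.array(F.lit('"')),  # or '"""' ? match the literal to the actual element
           F.col("column_1")
    ))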
Do that for each of your columns.