Multiple conditions for filter in Spark data frames
Question
I have a data frame with four fields. One of the fields is named Status, and I am trying to use an OR condition in .filter on the data frame. I tried the queries below, but neither worked.
df2 = df1.filter(("Status=2") || ("Status =3"))
df2 = df1.filter("Status=2" || "Status =3")
Has anyone done this before? I have seen a similar question on Stack Overflow. The answer there uses the code below for an OR condition, but that code is for PySpark.
from pyspark.sql.functions import col

numeric_filtered = df.where(
    (col('LOW') != 'null') |
    (col('NORMAL') != 'null') |
    (col('HIGH') != 'null'))
numeric_filtered.show()
Recommended answer
Instead of:
df2 = df1.filter("Status=2" || "Status =3")
Try:
df2 = df1.filter($"Status" === 2 || $"Status" === 3)
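The string version fails because `||` is not defined on Scala `String` values. `$"Status"` builds a `Column` instead, and `===` and `||` are `Column` operators. The sketch below shows this in a small standalone program, along with two equivalent forms: a single SQL expression string and `isin`. The sample rows, every column name except `Status`, the object name `FilterOrExample`, and the local `SparkSession` settings are assumptions made for illustration. They are not taken from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object FilterOrExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("filter-or-example")
      .master("local[*]")
      .getOrCreate()

    // Brings the $"column" syntax and toDF into scope.
    import spark.implicits._

    // Hypothetical four-field data frame; only "Status" comes from the question.
    val df1 = Seq(
      (1, "a", 1.0, 1),
      (2, "b", 2.0, 2),
      (3, "c", 3.0, 3)
    ).toDF("id", "name", "value", "Status")

    // Column expressions: === compares a Column to a value, || combines two Columns.
    val byColumns = df1.filter($"Status" === 2 || $"Status" === 3)

    // One SQL expression string; OR lives inside the string, not between two strings.
    val bySqlString = df1.filter("Status = 2 OR Status = 3")

    // isin is a compact alternative when testing one column against several values.
    val byIsin = df1.filter(col("Status").isin(2, 3))

    // Each of the three keeps the rows where Status is 2 or 3.
    byColumns.show()
    bySqlString.show()
    byIsin.show()

    spark.stop()
  }
}
```

Outside `spark-shell`, the `$"..."` syntax is only available after `import spark.implicits._`. Using `col("Status")` avoids that import.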