Comparison operator in PySpark (not equal / !=)

Views: 428 Published: 2020/9/4 3:13:01 Tags: sql apache-spark pyspark null apache-spark-sql

Problem description

I am trying to obtain all rows in a dataframe where two flags are set to '1', and subsequently all those where only one of the two is set to '1' and the other is NOT EQUAL to '1'.

With the following schema (three columns):

df = sqlContext.createDataFrame([('a',1,'null'),('b',1,1),('c',1,'null'),('d','null',1),('e',1,1)], #,('f',1,'NaN'),('g','bla',1)],
                            schema=('id', 'foo', 'bar')
                            )

I obtain the following dataframe:

+---+----+----+
| id| foo| bar|
+---+----+----+
|  a|   1|null|
|  b|   1|   1|
|  c|   1|null|
|  d|null|   1|
|  e|   1|   1|
+---+----+----+

When I apply the desired filters, the first filter (foo=1 AND bar=1) works, but the other (foo=1 AND NOT bar=1) does not:

foobar_df = df.filter( (df.foo==1) & (df.bar==1) )

yields:

+---+---+---+
| id|foo|bar|
+---+---+---+
|  b|  1|  1|
|  e|  1|  1|
+---+---+---+

Here is the filter that does not behave as expected:

foo_df = df.filter( (df.foo==1) & (df.bar!=1) )
foo_df.show()
+---+---+---+
| id|foo|bar|
+---+---+---+
+---+---+---+

Why is it not filtering? How can I get the rows where only foo is equal to '1'?
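
A likely explanation, offered as a sketch rather than a definitive answer: Spark follows SQL's three-valued logic, so comparing NULL with anything (including `bar != 1`) yields NULL rather than TRUE, and a filter keeps only rows whose predicate is TRUE. The pure-Python model below illustrates that behaviour on the data shown above. It is not Spark code: `None` stands in for NULL, and the helpers `sql_eq`, `sql_ne`, `sql_and`, `sql_or` and `sql_filter` are hypothetical names introduced only for this illustration.

```python
# Pure-Python model of SQL three-valued logic (TRUE / FALSE / NULL).
# None stands in for NULL. This only illustrates the semantics; it is
# not how Spark is implemented, and all helper names are hypothetical.

def sql_eq(a, b):
    """a = b: NULL if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b

def sql_ne(a, b):
    """a != b: the negation of a = b, so NULL stays NULL."""
    eq = sql_eq(a, b)
    return None if eq is None else not eq

def sql_and(p, q):
    """FALSE dominates; otherwise NULL if either side is NULL."""
    if p is False or q is False:
        return False
    if p is None or q is None:
        return None
    return True

def sql_or(p, q):
    """TRUE dominates; otherwise NULL if either side is NULL."""
    if p is True or q is True:
        return True
    if p is None or q is None:
        return None
    return False

def sql_filter(rows, predicate):
    """Keep only rows whose predicate is exactly TRUE (NULL is dropped)."""
    return [row for row in rows if predicate(row) is True]

# Same data as the dataframe in the question: (id, foo, bar)
rows = [("a", 1, None), ("b", 1, 1), ("c", 1, None), ("d", None, 1), ("e", 1, 1)]

# foo = 1 AND bar = 1
both = sql_filter(rows, lambda r: sql_and(sql_eq(r[1], 1), sql_eq(r[2], 1)))

# foo = 1 AND bar != 1  (the filter from the question)
naive = sql_filter(rows, lambda r: sql_and(sql_eq(r[1], 1), sql_ne(r[2], 1)))

# foo = 1 AND (bar IS NULL OR bar != 1)
null_aware = sql_filter(
    rows,
    lambda r: sql_and(sql_eq(r[1], 1), sql_or(r[2] is None, sql_ne(r[2], 1))),
)

print([r[0] for r in both])        # ['b', 'e']
print([r[0] for r in naive])       # []
print([r[0] for r in null_aware])  # ['a', 'c']
```

In PySpark itself, the null-aware variant would presumably be written with `Column.isNull()`, for example `df.filter((df.foo == 1) & (df.bar.isNull() | (df.bar != 1)))`; `Column.eqNullSafe` (Spark 2.3 and later) offers null-safe equality as an alternative. Note also that Python's `None`, not the string 'null', is the value that maps to a SQL NULL.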
