Filter df when values match part of a string in pyspark

Views: 9 Posted: 2021/12/22 21:25:19 Tags: python apache-spark pyspark apache-spark-sql


Problem description

I have a large pyspark.sql.dataframe.DataFrame and I want to keep (so filter) all rows where the URL saved in the location column contains a pre-determined string, e.g. 'google.com'.

I have tried:

import pyspark.sql.functions as sf
df.filter(sf.col('location').contains('google.com')).show(5)

but this throws a

TypeError: _TypeError: 'Column' object is not callable'

How do I get around this and filter my df properly? Many thanks in advance!

Solution

Spark 2.2 onwards

df.filter(df.location.contains('google.com'))

Spark 2.2 documentation link
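A minimal sketch of the 2.2+ approach, wrapped in a small helper. The names `spark_has_contains`, `filter_by_substring` and `demo` are illustrative and not part of PySpark, and the sample rows are made up. On Spark 2.1 and earlier, `Column` has no `contains` method, so `col.contains` is resolved as a nested-field lookup that returns another `Column`, and calling that is what produces the "'Column' object is not callable" error from the question. The version check assumes a version string that starts with `major.minor`.

```python
def spark_has_contains(version):
    """Return True if the given Spark version string is 2.2 or later."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (2, 2)


def filter_by_substring(df, column, needle):
    """Keep rows whose `column` value contains `needle` as a literal substring."""
    return df.filter(df[column].contains(needle))


def demo():
    # Needs a working PySpark installation; it is not called automatically.
    import pyspark
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]").appName("contains-demo").getOrCreate()
    )
    # Made-up sample data with a single 'location' column.
    df = spark.createDataFrame(
        [("https://www.google.com/search",), ("https://example.org/",)],
        ["location"],
    )
    if spark_has_contains(pyspark.__version__):
        filter_by_substring(df, "location", "google.com").show(5, truncate=False)
    spark.stop()
```

Calling `demo()` on a machine with PySpark installed should display only the rows whose `location` contains 'google.com'.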

Spark 2.1 and before

You can use plain SQL in filter:

df.filter("location like '%google.com%'")

or with DataFrame column methods:

df.filter(df.location.like('%google.com%'))

Spark 2.1 documentation link
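One thing to keep in mind with `like` is that `%` and `_` are wildcards, so a search string containing either of them matches more than the literal text. The sketch below builds the pattern from an arbitrary string and escapes those characters first. The helper names `like_pattern` and `filter_by_like` are hypothetical, and the code assumes a backslash is the escape character that `like` honours, which is worth checking against your Spark version.

```python
def like_pattern(needle):
    """Build a '%...%' LIKE pattern that treats `needle` as literal text.

    Assumption: backslash is the LIKE escape character.
    """
    escaped = (
        needle.replace("\\", "\\\\")  # escape the escape character first
        .replace("%", "\\%")          # literal percent sign
        .replace("_", "\\_")          # literal underscore
    )
    return "%" + escaped + "%"


def filter_by_like(df, column, needle):
    """Keep rows whose `column` value contains `needle`, using Column.like."""
    return df.filter(df[column].like(like_pattern(needle)))
```

For a plain string such as 'google.com' the pattern is simply '%google.com%', the same one used in the answer above, so `filter_by_like(df, 'location', 'google.com')` is equivalent to `df.filter(df.location.like('%google.com%'))`.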


                        