How to use str.contains() with multiple expressions in pandas dataframes?

Question

I'm wondering if there is a more efficient way to use the str.contains() function in Pandas, to search for two partial strings at once. I want to search a given column in a dataframe for data that contains either "nt" or "nv". Right now, my code looks like this:

    df[df['Behavior'].str.contains("nt", na=False)]
    df[df['Behavior'].str.contains("nv", na=False)]

And then I append one result to another. What I'd like to do is use a single line of code to search for any data that includes "nt" OR "nv" OR "nf." I've played around with some ways that I thought should work, including just sticking a pipe between terms, but all of these result in errors. I've checked the documentation, but I don't see this as an option. I get errors like this:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-113-1d11e906812c> in <module>()
    3 
    4 
    ----> 5 soctol = f_recs[f_recs['Behavior'].str.contains("nt"|"nv", na=False)]
    6 soctol

    TypeError: unsupported operand type(s) for |: 'str' and 'str'

Is there a fast way to do this? Thanks for any help, I am a beginner but am LOVING pandas for data wrangling.

Recommended answer

The pattern is a single regular expression, so the alternatives belong inside one string:

    "nt|nv"  # rather than "nt" | "nv"
    f_recs[f_recs['Behavior'].str.contains("nt|nv", na=False)]
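When there are more than two terms, such as the "nt", "nv" and "nf" mentioned in the question, the pattern can be built from a list instead of typed by hand. The following is a minimal sketch, not the answerer's code: the sample data and the names terms, pattern and mask are made up for illustration, and re.escape is added on the assumption that the terms should match literally even if they contain regex metacharacters.

```python
import re

import pandas as pd

# Hypothetical sample data; only the column name "Behavior" comes from the question.
f_recs = pd.DataFrame({"Behavior": ["hunting", "nv-call", "resting", None, "nf"]})

# Terms to search for; escaping keeps characters such as "." or "+" literal.
terms = ["nt", "nv", "nf"]
pattern = "|".join(re.escape(t) for t in terms)  # "nt|nv|nf"

# str.contains treats the pattern as a regular expression by default;
# na=False makes missing values count as "no match".
mask = f_recs["Behavior"].str.contains(pattern, na=False)
matches = f_recs[mask]
print(matches)
```

Joining the terms keeps the search to a single pass over the column, and the list can grow without changing the filtering line.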

Python doesn't define the | (bitwise or) operator for two strings, so "nt" | "nv" raises an error before pandas is ever called:

    In [1]: "nt" | "nv"
    TypeError: unsupported operand type(s) for |: 'str' and 'str'
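The | operator does work between the boolean Series that str.contains() returns, which gives another way to combine the two original searches without appending results. This is a sketch of that alternative, not part of the original answer; the sample data and the names has_nt, has_nv and either are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column name "Behavior" comes from the question.
f_recs = pd.DataFrame({"Behavior": ["hunting", "nv-call", "resting", None, "nf"]})

# Each call returns a boolean Series aligned with the dataframe's index.
has_nt = f_recs["Behavior"].str.contains("nt", na=False)
has_nv = f_recs["Behavior"].str.contains("nv", na=False)

# | is an element-wise "or" on boolean Series, so this keeps rows matching either term.
either = f_recs[has_nt | has_nv]
print(either)
```

This scans the column once per term, so the single-pattern form is usually the simpler choice when the list of terms is long.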
