python pandas.Series.str.contains WHOLE WORD

Question

df (a pandas DataFrame) has three rows.

col_name
"This is Donald."
"His hands are so small"
"Why are his fingers so short?"

I'd like to extract the rows that contain "is" or "small" as whole words.

If I do

df.col_name.str.contains("is|small", case=False)

then it catches "His" as well, which I don't want.

Is the query below the right way to match whole words in a Series?

df.col_name.str.contains("\bis\b|\bsmall\b", case=False)

Recommended answer

No, the regex /bis/b|/bsmall/b will fail because it uses /b rather than \b, the token that means "word boundary".

Change that and you get a match. In Python, also write the pattern as a raw string (r"...") so that \b is not read as a backspace character. I would recommend using

\b(is|small)\b

That regex is a little faster and a little more legible, at least to me.
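
To make the recommendation concrete, here is a minimal sketch that rebuilds the three-row DataFrame from the question and applies the grouped pattern. Two details are my own assumptions rather than part of the answer: the pattern is written as a raw string so Python passes \b through to the regex engine, and the group is non-capturing, (?:...), only because pandas tends to warn when str.contains receives a pattern with capture groups. The variable names are illustrative.

```python
import pandas as pd

# The three rows from the question.
df = pd.DataFrame({
    "col_name": [
        "This is Donald.",
        "His hands are so small",
        "Why are his fingers so short?",
    ]
})

# Raw string (r"...") keeps \b as a regex word boundary; in a plain
# string literal Python would turn "\b" into a backspace character.
# (?:...) is a non-capturing group (an assumption, not from the answer);
# it matches the same text as (is|small).
pattern = r"\b(?:is|small)\b"

mask = df["col_name"].str.contains(pattern, case=False)
whole_word_rows = df[mask]

# For comparison: without word boundaries, "is" also matches inside
# "This", "His" and "his", so every row is selected.
loose_mask = df["col_name"].str.contains("is|small", case=False)

print(whole_word_rows)
```

With word boundaries only the first two rows should be selected; the third row contains "his" but neither "is" nor "small" as a standalone word.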
