Pandas: find an exact given string/word in a column

Views: 25 | Published: 2021/9/6 19:40:38 | Tags: python pandas text-mining


Question

I have a pandas column named Notes that contains a sentence or explanation of some event. I am trying to find certain given words in that column, and when I find one, I add it to the next column, Type.

The problem is that for some specific words, for example Liar and Lies, it also picks up words like familiar and families, because they contain "liar" and "lies" as substrings.

Notes                                  Type
2 families are living in the address   Lies
He is a liar                           Liar
We are not familiar with this          Liar

As you can see above, only the second row is correct. How do I pick up only standalone words like liar and lies, and not families or familiar?

This is my approach:

word= ["Lies"]

for i in range(0, len(df)):
    for f in word:
        if f in df["Notes"][i]:
            df["Type"][i] = "Lies"

Any help is appreciated. Thanks.

Recommended answer

Use \b for a word boundary in the regex, and .str.extract to find the pattern:

 df.Notes.str.extract(r'\b(lies|liar)\b')
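As a rough illustration of what the word boundary buys you, here is a minimal sketch that rebuilds the sample rows from the question and runs the same pattern on them. The DataFrame construction and the `matches` variable are assumptions added for the example; `expand=False` is used only so the single capture group comes back as a Series.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({"Notes": [
    "2 families are living in the address",
    "He is a liar",
    "We are not familiar with this",
]})

# \b only matches at a word boundary, so "lies" inside "families"
# and "liar" inside "familiar" are not picked up
matches = df["Notes"].str.extract(r"\b(lies|liar)\b", expand=False)
print(matches.tolist())  # only the second row has a match
```

Rows without a standalone match come back as NaN, so only "He is a liar" yields a value.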

To label the rows containing that word, do:

import numpy as np
df['Type'] = np.where(df.Notes.str.contains(r'\b(?:lies|liar)\b'), 'Lies', 'Not Lies')
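If the goal is to store the matched word itself in Type, as the question describes, one possible variation is sketched below. It is not part of the original answer: the `words` list, the extra fourth row, the case-insensitive flag, and the `.str.capitalize()` step are all assumptions made for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({"Notes": [
    "2 families are living in the address",
    "He is a liar",
    "We are not familiar with this",
    "These are lies",  # hypothetical extra row, not in the question
]})

words = ["Lies", "Liar"]  # hypothetical list of words to search for

# Build \b(Lies|Liar)\b, escaping each word in case it contains regex metacharacters
pattern = r"\b(" + "|".join(map(re.escape, words)) + r")\b"

# Extract the matched word (ignoring case) and normalise it to "Lies" / "Liar";
# rows without a standalone match stay NaN
df["Type"] = (
    df["Notes"]
    .str.extract(pattern, flags=re.IGNORECASE, expand=False)
    .str.capitalize()
)
print(df)
```

Building the pattern from a list keeps the word set in one place, and it avoids the row-by-row loop and the chained assignment used in the question.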
