Match exact phrase within a string in Python

Question

I'm trying to determine whether a substring is in a string. The problem is that I don't want my function to return True if the substring only appears as part of another word in the string.

For example, if the substring is "Purple cow" and the string is "Purple cows make the best pets.", the function should return False, since "cow" isn't plural in the substring.

And if the substring is "Purple cow" and the string is "Your purple cow trampled my hedge!", the function should return True.

My code is as follows:

def is_phrase_in(phrase, text):
    phrase = phrase.lower()
    text = text.lower()

    return phrase in text


text = "Purple cows make the best pets!"
phrase = "Purple cow"
print(is_phrase_in(phrase, text))  # prints True, but False is wanted

In my actual code I clean up unnecessary punctuation and spaces in 'text' before comparing it to phrase, but otherwise it is the same. I've tried using re.search, but I don't understand regular expressions very well yet and have only gotten the same behavior from them as in my example.

Thanks for any help you can provide!

Recommended answer

Since your phrase can contain multiple words, a simple split and intersect won't work. I'd use a regex with \b word-boundary anchors on both sides of the phrase, so it only matches when it isn't part of a longer word:

import re

def is_phrase_in(phrase, text):
    return re.search(r"\b{}\b".format(phrase), text, re.IGNORECASE) is not None

phrase = "Purple cow"

print(is_phrase_in(phrase, "Purple cows make the best pets!"))   # False
print(is_phrase_in(phrase, "Your purple cow trampled my hedge!"))  # True
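
One caveat with the snippet above: the phrase is inserted into the pattern unchanged, so a phrase containing regex metacharacters (such as "." or "+") is interpreted as regex syntax rather than literal text. A minimal sketch of a more defensive variant follows. The name is_phrase_in_escaped is hypothetical, and swapping \b for lookarounds is an assumption about what "whole phrase" should mean when the phrase begins or ends with punctuation:

```python
import re

def is_phrase_in_escaped(phrase, text):
    # Hypothetical variant of is_phrase_in, not part of the original answer.
    # re.escape makes any regex metacharacters in the phrase match literally.
    # (?<!\w) and (?!\w) require that no word character touches either end
    # of the match, which still works if the phrase ends in punctuation.
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, text, re.IGNORECASE) is not None

print(is_phrase_in_escaped("Purple cow", "Purple cows make the best pets!"))     # False
print(is_phrase_in_escaped("Purple cow", "Your purple cow trampled my hedge!"))  # True
print(is_phrase_in_escaped("C++", "I write C++ every day."))                     # True
```

For phrases made only of letters, digits and spaces, this should behave the same as the \b version.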
