Python regular expression to remove repeated words
Question
I am very new to Python.
I want to change a sentence if it contains repeated words.
Correct:
- E.g. "this is just so so so nice" --> "this is just so nice"
- E.g. "this is just is is" --> "this is just is"
Right now I am using this regex, but it also changes repeated letters, not just repeated words. E.g. "My friend and i is happy" --> "My friend and is happy" (it removes the "i" and the space). ERROR
text = re.sub(r'(\w+)\1', r'\1', text)  # remove duplicated words in a row
How can I make the same change, but have it compare whole words instead of letters?
Recommended answer
text = re.sub(r'\b(\w+)( \1\b)+', r'\1', text)  # remove duplicated words in a row
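The \b matches the empty string, but only at the beginning or end of a word. Putting it before the captured word and after each repeat forces the pattern to compare whole words, so the "i" in "i is" no longer matches the first letter of "is".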
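To see the difference in practice, here is a minimal sketch that runs both patterns on sample input. The function name remove_repeated_words and the first sample sentence are illustrative assumptions, not part of the original answer, and the pattern only handles repeats separated by a single space.

```python
import re

# Pattern from the answer: a whole word, followed by one or more
# repetitions of " <same word>", each ending at a word boundary.
DUPLICATE_WORDS = re.compile(r'\b(\w+)( \1\b)+')


def remove_repeated_words(text):
    """Collapse runs of the same word separated by single spaces (hypothetical helper)."""
    return DUPLICATE_WORDS.sub(r'\1', text)


# Repeated words are collapsed to a single occurrence.
print(remove_repeated_words("this is just so so so nice"))
# -> this is just so nice

# "i is" is left alone: \b requires the repeat to end at a word boundary.
print(remove_repeated_words("My friend and i is happy"))
# -> My friend and i is happy

# The pattern from the question has no word boundaries, so it collapses
# doubled letters inside a word instead of doubled words.
print(re.sub(r'(\w+)\1', r'\1', "My friend and i is happy"))
# -> My friend and i is hapy
```

If the input may contain tabs or several spaces between repeats, replacing the literal space with \s+ is one possible adjustment, though that goes beyond what the answer shows.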