Regex to remove words in R that contain a letter or special character repeated several times in a row
Problem description
I want to remove every word in which the same letter or special character occurs more than twice in a row.
For example, given the input
"Google in theee lland of whhhat c#, c++ and e###"
the output should be
"Google in lland of c#, c++ and"
Recommended answer
x <- "Google in theee lland of whhhat c#, c++ and e###"
gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
# [1] "Google in lland of c#, c++ and "
(\\S)\\1\\1 matches a single non-space character followed by two more copies of itself, that is, the same character three times in a row.
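To see which words this core pattern flags, it can be tested on each word separately. The following is a minimal sketch rather than part of the original answer; the names words and has_triple are illustrative assumptions.

```r
x <- "Google in theee lland of whhhat c#, c++ and e###"

# Split on runs of whitespace to get the individual words
words <- strsplit(x, "\\s+")[[1]]

# TRUE for words that contain the same non-space character three times in a row
has_triple <- grepl("(\\S)\\1\\1", words)

words[has_triple]
# [1] "theee"  "whhhat" "e###"
```

Words such as "lland" and "c++" are not flagged because their repeated character occurs only twice in a row.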
The surrounding \\S* and \\S*\\s? match the rest of the same word before and after the repeated run, plus at most one whitespace character immediately following the word, so the whole word is removed rather than just the repeated characters.
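Because \\s? only consumes whitespace after a removed word, removing the last word in the string leaves the space that preceded it, which is why the result of the gsub() call ends in a trailing space while the requested output does not. A minimal sketch of one way to get the exact string from the question, assuming it is acceptable to strip leading and trailing whitespace with base R's trimws() (this wrapper and the name result are additions, not part of the original answer):

```r
x <- "Google in theee lland of whhhat c#, c++ and e###"

# Same pattern as in the answer; trimws() drops the space left behind
# when the final word is removed
result <- trimws(gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x))
result
# [1] "Google in lland of c#, c++ and"
```

Note that trimws() would also strip any leading or trailing whitespace that was already present in the input, which may or may not be wanted.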