Regex to remove words that contain the same letter/special character more than twice in a row in R


Problem description

I want to remove every word in which the same letter or special character occurs more than twice in a row.

For example, given the input

"Google in theee lland of whhhat c#, c++ and e###"

the output should be

"Google in lland of c#, c++ and"

Recommended answer

x <- "Google in theee lland of whhhat c#, c++ and e###"
gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", x)
# [1] "Google in lland of c#, c++ and "

(\\S)\\1\\1 matches three consecutive occurrences of the same non-space character: (\\S) captures one character, and each \\1 requires that same character again.
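To see what the core pattern matches on its own, the following sketch extracts the repeated runs without removing anything. The variable name `runs` and the use of `perl = TRUE` are choices made for this illustration and are not part of the original answer.

```r
x <- "Google in theee lland of whhhat c#, c++ and e###"

# Extract every run of three identical non-space characters.
# perl = TRUE is an assumption of this sketch; the answer uses the default engine.
runs <- regmatches(x, gregexpr("(\\S)\\1\\1", x, perl = TRUE))[[1]]

print(runs)
# [1] "eee" "hhh" "###"
```

Doubled characters such as the "oo" in Google, the "ll" in lland and the "++" in c++ are not matched, because the pattern needs the same character three times in a row.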

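The surrounding \\S* and \\S*\\s? consume the rest of the same word before and after the run, plus at most one whitespace character immediately following the word. This is why the result above still ends in a space: the last removed word, e###, has no trailing space to take with it, so the space before it is left behind.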
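If the leftover trailing space is unwanted, one possible approach is to wrap the substitution in a small helper and trim the result. This is only a sketch: the function name `remove_repeated_words` is hypothetical, and `perl = TRUE` and `trimws()` are additions that do not appear in the original answer.

```r
# Hypothetical helper: drop every word that contains the same non-space
# character three or more times in a row, then trim leftover edge whitespace.
remove_repeated_words <- function(text) {
  # Same pattern as the answer; perl = TRUE is an assumption of this sketch.
  cleaned <- gsub("\\S*(\\S)\\1\\1\\S*\\s?", "", text, perl = TRUE)
  # trimws() removes leading and trailing whitespace only.
  trimws(cleaned)
}

x <- "Google in theee lland of whhhat c#, c++ and e###"
result <- remove_repeated_words(x)
print(result)
# [1] "Google in lland of c#, c++ and"
```

Because a "word" here is any run of non-space characters, punctuation attached to a word (such as the comma in "c#,") is treated as part of that word.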
