R code for removing words containing @
Problem description
I want to replace every word containing the symbol @ with a specific word. I am using gsub, applied to a character vector. The issue that keeps occurring is that when I use:
gsub(".*@.*", "email", data)
all of the text in that element of the character vector is replaced, not just the email address.
There are many different emails, all of different lengths, so I can't fix the number of characters before and after the @ to a specific value.
Any suggestions?
I've done my fair share of reading about regex, but everything I tried failed.
Here is an example:
data <- c("This is an example. Here is my email: emailaddress@help.com. Thank you")
data <- gsub(".*@.*", "email", data)
It returns [1] "email"
when I want [1] "This is an example. Here is my email: email. Thank you"
Recommended answer
You can use the following.
gsub('\\S+@\\S+', 'email', data)
Explanation:
\S matches any non-whitespace character. So here we match one or more non-whitespace characters, followed by @, followed by one or more non-whitespace characters. Because the match cannot cross whitespace, only the word containing @ is replaced.
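To make the difference concrete, here is a minimal sketch that runs both patterns on the example string from the question. The variable names whole_string, token_only and keep_punct are only illustrative, and the third pattern is an assumption of mine rather than part of the answer: it is one possible way to leave a trailing period in place.

```r
data <- c("This is an example. Here is my email: emailaddress@help.com. Thank you")

# ".*" is greedy and matches any character, spaces included, so the
# pattern spans the entire string and everything is replaced.
whole_string <- gsub(".*@.*", "email", data)

# "\\S+" stops at whitespace, so only the token around "@" is replaced.
# The sentence-ending period is also non-whitespace, so it is consumed too.
token_only <- gsub("\\S+@\\S+", "email", data)

# Hypothetical variant (not from the answer): force the match to end on a
# word character so that trailing punctuation stays in the text.
keep_punct <- gsub("\\S+@\\S*\\w", "email", data, perl = TRUE)

print(whole_string)
print(token_only)
print(keep_punct)
```

Note that with the answer's pattern the period after help.com is part of the match, so the result reads "... Here is my email: email Thank you". If the punctuation matters, a variant along the lines of keep_punct may be closer to the output requested in the question.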