Invalid regular expression in gsub
Question
Why does the email regex below give the error "invalid regular expression '^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$', reason 'Invalid character range'"?
blogs.smpl <- "mail:mami@yahoo.com: subject:Lorem Ipsum body: is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"
blogs.smpl <- gsub("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$","",blogs.smpl)
Recommended answer
Because - should only appear at the start or end of a character class. Anywhere else, it denotes a range between the character before it and the character after it.
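The rule can be checked directly in R. The sketch below uses a hypothetical helper, compiles(), which is not part of R or of the original answer; it only reports whether grepl() accepts a pattern under the default regex engine.

```r
# Hypothetical helper (an assumption for illustration): TRUE if the pattern
# compiles under R's default engine, FALSE if the engine rejects it.
compiles <- function(pattern) {
  tryCatch({
    suppressWarnings(grepl(pattern, ""))
    TRUE
  }, error = function(e) FALSE)
}

compiles("[0-9-.]")   # hyphen in the middle, right after a range: rejected
compiles("[0-9.-]")   # hyphen last: accepted
compiles("[-0-9.]")   # hyphen first: accepted
```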
The last character class is faulty: [a-zA-Z0-9-.]. It must be changed to [a-zA-Z0-9.-].
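As a minimal sketch, here is the corrected pattern applied with gsub(). The names email_re, unanchored_re and x are illustrative assumptions, and x is a shortened stand-in for the string in the question. Note that with ^ and $ in place the pattern only matches a string that consists of an address and nothing else, so the unanchored variant is shown as one possible way to strip an address embedded in longer text.

```r
# Corrected pattern: the hyphen now sits at the end of the last character class.
email_re <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+$"

# The pattern compiles and matches a bare address.
grepl(email_re, "mami@yahoo.com")   # TRUE

# Shortened stand-in for the question's input (assumption for illustration).
x <- "mail:mami@yahoo.com: subject:Lorem Ipsum"

# Anchored: the whole string is not an address, so nothing is replaced.
identical(gsub(email_re, "", x), x)   # TRUE

# Hypothetical unanchored variant: removes the address wherever it occurs.
unanchored_re <- "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+"
gsub(unanchored_re, "", x)   # "mail:: subject:Lorem Ipsum"
```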
NOTE: In R, you cannot escape a hyphen inside a character class to match a literal hyphen unless you use perl=TRUE.
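A short sketch of the perl=TRUE route, assuming the hyphen is left in the middle of the class and escaped instead of moved. The name escaped_re and the second test address are made up for illustration.

```r
# Escaped hyphen in the middle of the last character class, evaluated by PCRE.
escaped_re <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9\\-.]+$"

grepl(escaped_re, "mami@yahoo.com", perl = TRUE)       # TRUE
# The escaped hyphen is matched literally (made-up address).
grepl(escaped_re, "mami@my-host.co-op", perl = TRUE)   # TRUE
```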
Also, see the R String Manipulation PDF for more information on R character classes (page 2) and regexes in general. Here is an excerpt:
Rules for matching characters literally inside a character class:
To match ] inside a character class put it first.
To match - inside a character class put it first or last.
To match ^ inside a character class put it anywhere but first.
To match any other character or metacharacter (but \) inside a character class put it anywhere.
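The rules in the excerpt can be tried out with a few one-line checks under R's default engine. The vector name checks and the sample patterns are illustrative assumptions, not part of the excerpt.

```r
checks <- c(
  bracket_first = grepl("[]x]", "]"),   # ] placed first is literal
  hyphen_first  = grepl("[-x]", "-"),   # - placed first is literal
  hyphen_last   = grepl("[x-]", "-"),   # - placed last is literal
  caret_later   = grepl("[x^]", "^"),   # ^ anywhere but first is literal
  star_literal  = grepl("[x.*]", "*"),  # other metacharacters are literal
  dot_literal   = grepl("[x.*]", "y")   # so . does not match "any character"
)
print(checks)   # the first five are TRUE, dot_literal is FALSE
```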