Invalid regular expression in gsub

Question

Why does this email regex raise the error "invalid regular expression '^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$', reason 'Invalid character range'"?

blogs.smpl <- "mail:mami@yahoo.com: subject:Lorem Ipsum body:   is simply dummy text of the printing and typesetting industry. 
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s"

blogs.smpl <- gsub("^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+$","",blogs.smpl)

Recommended answer

Because a literal - should only appear at the start or end of a character class. Anywhere else, it denotes a range between the character before it and the character after it.

The last character class, [a-zA-Z0-9-.], is the faulty one. Change it to [a-zA-Z0-9.-].
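As a minimal sketch of the fix, the snippet below applies the corrected pattern to a shortened stand-in for the question's string. Note that ^ and $ require the whole string to be an address, so the anchored pattern compiles but removes nothing here. The unanchored variant, and the names email_anchored and email_anywhere, are illustrative assumptions about the intent, not part of the original answer.

```r
# Corrected pattern: the hyphen in the last class is moved to the end.
email_anchored <- "^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+$"

# Shortened stand-in for the question's blogs.smpl string.
blogs.smpl <- "mail:mami@yahoo.com: subject:Lorem Ipsum"

# Compiles without error, but ^ and $ force a whole-string match,
# so the input comes back unchanged.
unchanged <- gsub(email_anchored, "", blogs.smpl)

# Assumption: the goal is to strip the address wherever it occurs,
# so the anchors are dropped.
email_anywhere <- "[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9.-]+"
stripped <- gsub(email_anywhere, "", blogs.smpl)
print(stripped)
```

If the input were a vector of bare addresses, the anchored form would be the appropriate one.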

NOTE: In R, you cannot escape a hyphen inside a character class to match a literal hyphen unless you use perl=TRUE.
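To illustrate the note, here is a small sketch with a made-up input string. With perl = TRUE the escaped hyphen in the middle of the class is treated as a literal. Placing the hyphen last gives the same result and works with the default engine too.

```r
x <- "a-z b"  # hypothetical sample input

# PCRE (perl = TRUE): \- inside the class is a literal hyphen,
# so the class matches "a", "-" and "z".
escaped <- gsub("[a\\-z]", "", x, perl = TRUE)

# Portable alternative: put the hyphen last, no escape needed.
hyphen_last <- gsub("[az-]", "", x)

print(escaped)
print(hyphen_last)
```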

Also, see the R String Manipulation PDF for more information on R character classes (page 2) and regexes in general. Here is an excerpt:

These are the rules for matching characters literally inside a character class:

To match ] inside a character class, put it first.

To match - inside a character class, put it first or last.

To match ^ inside a character class, put it anywhere but first.

To match any other character or metacharacter (except \) inside a character class, put it anywhere.
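The placement rules above can be tried out with a few short gsub calls. This is only a sketch using made-up input strings and R's default regex engine. The variable names are illustrative.

```r
# ] is literal when it comes first in the class.
no_bracket <- gsub("[]x]", "", "a]bx")

# - is literal when it comes first or last.
no_hyphen_first <- gsub("[-.]", "", "a.b-c")
no_hyphen_last  <- gsub("[.-]", "", "a.b-c")

# ^ is literal anywhere except the first position.
no_caret <- gsub("[a^]", "", "a^b")

print(c(no_bracket, no_hyphen_first, no_hyphen_last, no_caret))
```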
