R: Ignore a character within a regex string


Question

I've looked all over for a regex construct that will cause R to disregard the next character within a regular expression string.

For example, given myvector:

 myvector <- c("abcdef", "ghijkl", "mnopqrs")

and the regex string:

 regexstring <- "[a-z]{3}XXXXXXXXX "

where XXXXXXXXX stands for some unknown characters, I want to tell R to ignore the final space in the regular expression string itself.

Running the following,

regexstring <- "[a-z]{3} "
sub(regexstring, " ", myvector)

gives

"abcdef"  "ghijkl"  "mnopqrs"

because there are no spaces in any of the strings. I'm hoping that, once XXXXXXXXX is included, I will get the same output as if I had run

regexstring <- "[a-z]{3}"
sub(regexstring, " ", myvector)

which is:

 " def"  " jkl"  " pqrs"

I can't erase the final space or use trimws(), etc., and I don't see a way to make R disregard the final space. Is there any XXXXXXXXX that does this? Thanks.

Recommended answer

The final space can be turned into a formatting (ignored) space by putting the (?x) free-spacing inline modifier in place of the XXXs and passing the perl=TRUE argument so that the pattern is parsed by the PCRE regex engine.

myvector <- c("abcdef", "ghijkl", "mnopqrs")
regexstring <- "[a-z]{3}(?x) "
sub(regexstring, " ", myvector, perl=TRUE) 
## => [1] " def"  " jkl"  " pqrs"

See the R demo.

Note that placing (?x) in the middle of the pattern affects all literal whitespace to its right, either until the end of the pattern or until a (?-x) modifier turns free-spacing mode off again.
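The scope of the modifier can be checked with a small sketch. The input vector `x` and the result names `mid`, `off`, `esc` and `cls` are made up for illustration and are not from the question. The last two calls rely on standard PCRE behaviour: in free-spacing mode a space still matches literally when it is escaped or placed inside a character class.

```r
# Hypothetical inputs chosen for this sketch; they are not from the question.
x <- c("a bc", "abc", "ab c")

# (?x) in the middle: the space before it is literal, the space after it
# is ignored, so the pattern is read as "a bc".
mid <- sub("a b(?x) c", "_", x, perl = TRUE)

# (?x) ... (?-x): the first space is ignored and the second is literal
# again, so the pattern is read as "ab c".
off <- sub("(?x)a b(?-x) c", "_", x, perl = TRUE)

# In free-spacing mode a space can still be matched by escaping it or by
# putting it in a character class; both patterns are read as "a bc".
esc <- sub("(?x)a\\ b c", "_", x, perl = TRUE)
cls <- sub("(?x)a[ ]b c", "_", x, perl = TRUE)

print(mid)  # only "a bc" is replaced
print(off)  # only "ab c" is replaced
```

If whitespace later in the pattern has to stay significant, the escaped or bracketed form is probably the safer choice, since it does not depend on where the modifier sits.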
