R: Ignore a character within a regex string
Question
I've looked all over for some regex that will cause R to disregard the next character within a regular expression string.
For example, given myvector:
myvector <- c("abcdef", "ghijkl", "mnopqrs")
and the regex string:
regexstring <- "[a-z]{3}XXXXXXXXX "
which includes some unknown characters XXXXXXXXX, I want to tell R to ignore the final space in the regular expression string itself.
Running the following,
regexstring <- "[a-z]{3} "
sub(regexstring, " ", myvector)
gives
"abcdef" "ghijkl" "mnopqrs"
because there are no spaces in any of the strings. But after including XXXXXXXXX, I hope to get the same output as if I had run
regexstring <- "[a-z]{3}"
sub(regexstring, " ", myvector)
which is:
" def" " jkl" " pqrs"
I can't erase the final space or use trimws(), etc., and I don't see a way to make R disregard the final space. Is there any XXXXXXXXX that does this? Thanks.
Recommended answer
The final space can be turned into a formatting space, which the regex engine ignores, by putting the (?x) free-spacing inline modifier in place of the XXXs. Pass the perl=TRUE argument so that the pattern is parsed by the PCRE regex engine.
myvector <- c("abcdef", "ghijkl", "mnopqrs")
regexstring <- "[a-z]{3}(?x) "
sub(regexstring, " ", myvector, perl=TRUE)
## => [1] " def" " jkl" " pqrs"
See the R demo.
Note that placing (?x) in the middle of the pattern affects any literal whitespace to its right, either until the end of the pattern (or of the enclosing group) or until a (?-x) modifier switches free-spacing mode off again.
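The scope of (?x) can be checked with a small sketch. It is only an illustration under some assumptions: the second element of the vector is changed to "ghi jkl" so that a literal space has something to match, the replacement is "_" so the effect is visible, and the names ignored, scoped and escaped are made up for this example.

```r
# Illustrative vector (an assumption): the second element contains a space
myvector <- c("abcdef", "ghi jkl", "mnopqrs")

# (?x) applies to everything to its right, so the trailing space is ignored
# and the pattern behaves like "[a-z]{3}"
ignored <- sub("[a-z]{3}(?x) ", "_", myvector, perl = TRUE)
print(ignored)

# (?-x) switches free-spacing off again: the first space is ignored,
# the space after (?-x) is literal and must be present in the input
scoped <- sub("[a-z]{3}(?x) (?-x) ", "_", myvector, perl = TRUE)
print(scoped)

# While (?x) is active, an escaped space still matches a literal space
escaped <- sub("[a-z]{3}(?x) \\ ", "_", myvector, perl = TRUE)
print(escaped)
```

In the first call every element is modified, because the pattern reduces to three lowercase letters. In the other two calls only "ghi jkl" changes, since those patterns require a real space after the three letters.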