How can I use back references with `grep` in R?
Problem description
I am looking for an elegant way of returning back references using regular expressions in R. Let me explain:

Let's say I want to find strings that start with a month name:

x <- c("May, 1, 2011", "30 June 2011")
grep("May|^June", x, value=TRUE)
[1] "May, 1, 2011"

This works, but I really want to isolate the month (i.e. "May"), not the entire matched string.

So, one can use gsub to return the back reference using the replacement parameter. But this has two problems:
- You have to wrap the pattern inside .*(pattern).* so that the substitution occurs on the entire string.
- Rather than returning NA for non-matching strings, gsub returns the original string. This is clearly not what I desire.

The code and results:

gsub(".*(^May|^June).*", "\\1", x)
[1] "May" "30 June 2011"

I could probably code a workaround by doing all kinds of additional checks, but this quickly becomes very messy.

To be crystal clear, the desired results should be:

[1] "May" NA

Is there an easy way of achieving this?

Recommended answer

The stringr package has a function exactly for this purpose:

library(stringr)
x <- c("May, 1, 2011", "30 June 2011", "June 2012")
str_extract(x, "May|^June")
# [1] "May" NA "June"

It's a fairly thin wrapper around regexpr, but stringr generally makes string handling easier by being more consistent than base R functions.
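
For readers who would rather stay in base R, here is a minimal sketch of the same result without stringr. It is not taken from the answer itself: the variable names `month`, `month_bref`, and `wrapped` are illustrative, and the second half assumes the wrapped pattern from the question is the one you want. The first half uses `regexpr` with `regmatches`; the second keeps the back reference but guards it with `grepl` so that non-matching strings become NA.

```r
x <- c("May, 1, 2011", "30 June 2011", "June 2012")

# Approach 1: regexpr() + regmatches().
# regexpr() gives the position of the first match per string, or -1 if none.
m <- regexpr("May|^June", x)

# regmatches() drops non-matching elements, so start from a vector of NAs
# and fill in only the positions that actually matched.
month <- rep(NA_character_, length(x))
month[m != -1] <- regmatches(x, m)
# month is now c("May", NA, "June")

# Approach 2: back reference with sub(), guarded by grepl().
# The pattern is wrapped in .*( ... ).* so the whole string is replaced
# by the captured group; strings that do not match are set to NA.
wrapped <- ".*(^May|^June).*"
month_bref <- ifelse(grepl(wrapped, x), sub(wrapped, "\\1", x), NA_character_)
# month_bref is also c("May", NA, "June")
```

The first approach is roughly what a thin wrapper around `regexpr` would have to do; the second is closer to the workaround the question describes, kept to a single guard rather than many separate checks.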