R regex lookbehind and lookahead issue
Problem description
I am trying to extract passages like 44.11.36.00-1 (precisely, nn.nn.nn.nn-n, where n stands for any digit from 0 to 9) from text in R.
I want to extract a passage only if it is directly adjacent to ("stuck to") non-digit characters:
- 44.11.36.00-1 extracted from nsfghstighsl44.11.36.00-1vsdfgh is OK
- 44.11.36.00-1 extracted from fa0044.11.36.00-1000 is NOT
I have read that str_extract_all does not work with lookbehind and lookahead expressions, so I reluctantly went back to grep, but I cannot get it to work:
> pattern1 <- "(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})"
> grep(pattern1, "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 ", perl=TRUE, value = TRUE)
[1] "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 "
This is not the result I expected.
My understanding is:
- (?<![0-9]{1}) means "match an expression which is not preceded by a digit"
- [0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1} stands for the expression I am looking for
- (?![0-9]{1}) means "match an expression which is not followed by a digit"
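The reading of the three parts above can be checked in isolation with grepl, which only reports whether a match exists. The following is a minimal sketch; the test strings in candidates are made up for illustration and do not come from the original post.

```r
# Same pattern as in the question ({1} quantifiers dropped, they are redundant).
pat <- "(?<![0-9])[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9](?![0-9])"

# Illustrative inputs (assumed for this sketch):
#   1. code surrounded by letters          -> should match
#   2. code directly preceded by a digit   -> lookbehind should reject it
#   3. code directly followed by a digit   -> lookahead should reject it
candidates <- c("abc44.11.36.00-1xyz", "044.11.36.00-1xyz", "abc44.11.36.00-10")

# perl = TRUE is required: lookarounds are a PCRE feature.
found <- grepl(pat, candidates, perl = TRUE)
print(found)
```

If the lookarounds behave as described, only the first element should be reported as a match, which suggests the pattern itself is fine and the surprise comes from how grep reports results.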
Recommended answer
As @Roland said in his comment, you need to use regmatches instead of grep. With value = TRUE, grep returns every input element that contains a match, in full, rather than the matched substring:
> s <- "nsfghstighsl44.11.36.00-1vsdfgh"
> m <- gregexpr("(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})", s, perl=TRUE)
> regmatches(s, m)
[[1]]
[1] "44.11.36.00-1"
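To make the difference between the two approaches concrete, here is a minimal sketch using the test string from the question. The variable names txt, pat, whole, and hits are illustrative only.

```r
# Test string from the question: only the first token has the dotted form.
txt <- "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 "
pat <- "(?<![0-9])[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9](?![0-9])"

# grep() answers "which elements contain a match?". With value = TRUE it
# returns those elements unchanged, so the whole string comes back.
whole <- grep(pat, txt, perl = TRUE, value = TRUE)

# gregexpr() records the match positions; regmatches() cuts out the
# substrings. The result is a list with one entry per input element.
hits <- regmatches(txt, gregexpr(pat, txt, perl = TRUE))[[1]]

print(identical(whole, txt))
print(hits)
```

So the pattern in the question was matching as intended; grep was simply the wrong tool for pulling out the matched text.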
A shorter version of the pattern, applied to a vector:
> x <- c('nsfghstighsl44.11.36.00-1vsdfgh', 'fa0044.11.36.00-1000')
> m <- gregexpr("(?<!\\d)\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}-\\d(?!\\d)", x, perl=TRUE)
> regmatches(x, m)
[[1]]
[1] "44.11.36.00-1"

[[2]]
character(0)
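Because regmatches returns a list with one entry per input string, it can be convenient to flatten the result. The sketch below wraps the approach in a helper; extract_codes is a hypothetical name introduced here, not part of the original answer or of any package.

```r
# Hypothetical helper (not from the original answer): returns every code
# found in a character vector as one flat character vector.
extract_codes <- function(x) {
  pat <- "(?<!\\d)\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}-\\d(?!\\d)"
  m <- gregexpr(pat, x, perl = TRUE)
  # unlist() drops the per-element structure and any empty results.
  unlist(regmatches(x, m))
}

x <- c("nsfghstighsl44.11.36.00-1vsdfgh", "fa0044.11.36.00-1000")
codes <- extract_codes(x)
print(codes)
```

Note that flattening loses the information about which input string each match came from; keep the list form if that mapping matters.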