R regex lookbehind/lookahead issue

Question

I am trying to extract substrings like 44.11.36.00-1 (precisely, nn.nn.nn.nn-n, where n stands for any digit from 0 to 9) from text in R.

I want to extract a substring only when the characters directly before and after it are not digits:

  • 44.11.36.00-1 extracted from nsfghstighsl44.11.36.00-1vsdfgh is OK
  • 44.11.36.00-1 extracted from fa0044.11.36.00-1000 is NOT OK

I have read that str_extract_all does not work with lookbehind and lookahead expressions, so I reluctantly went back to grep, but I cannot get it to work:

> pattern1 <- "(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})"
> grep(pattern1, "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 ", perl=TRUE, value = TRUE)

[1] "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 "

This is not the result I expected.

My understanding is:

  • (?<![0-9]{1}) means "match only if the expression is not preceded by a digit"
  • [0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1} is the expression I am looking for
  • (?![0-9]{1}) means "match only if the expression is not followed by a digit"
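
The reading of the three parts above can be checked in isolation. The following is a minimal sketch, assuming base R with perl = TRUE; the three test strings are made up for illustration and are not from the original post. It only asks whether each string contains a match, so it tests the pattern rather than the extraction step.

```r
# Pattern copied from the question; perl = TRUE is needed for lookarounds.
pattern1 <- "(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})"

# Illustrative inputs (assumptions, not from the original post):
tests <- c(
  "a44.11.36.00-1b",   # letters on both sides: should match
  "044.11.36.00-1b",   # digit directly before: lookbehind should reject
  "a44.11.36.00-10"    # digit directly after: lookahead should reject
)

# grepl() returns one logical per element: does it contain a match?
has_match <- grepl(pattern1, tests, perl = TRUE)
print(has_match)  # TRUE FALSE FALSE
```

If this prints TRUE, FALSE, FALSE, the pattern itself behaves as described in the list, which suggests the unexpected result comes from how the match is retrieved rather than from the lookarounds.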

Recommended answer

As @Roland said in his comment, you need to use regmatches instead of grep. With value = TRUE, grep returns each input element that contains a match in full, whereas regmatches extracts only the matched substrings:

> s <- "nsfghstighsl44.11.36.00-1vsdfgh"
> m <- gregexpr("(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})", s, perl=TRUE)
> regmatches(s, m)
[[1]]
[1] "44.11.36.00-1"
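
To make the difference between the two functions concrete, here is a small sketch that runs both on the string from the question. It assumes base R only; the variable names txt, whole and parts are chosen here for illustration.

```r
# Pattern and input string copied from the question.
pattern1 <- "(?<![0-9]{1})[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}\\.[0-9]{2}-[0-9]{1}(?![0-9]{1})"
txt <- "dyj44.11.36.00-1aregjspotgji 44113600-1 agdtklj441136001 "

# grep(value = TRUE) returns the matching *elements* of the input, unchanged.
whole <- grep(pattern1, txt, perl = TRUE, value = TRUE)

# gregexpr() records where the matches are; regmatches() cuts them out.
m <- gregexpr(pattern1, txt, perl = TRUE)
parts <- regmatches(txt, m)[[1]]  # [[1]]: matches for the first (only) element

print(identical(whole, txt))  # TRUE: grep handed back the whole string
print(parts)                  # "44.11.36.00-1"
```

The other two number runs in the string contain no dots, so they never satisfy the pattern and only one substring is extracted.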

A slightly shorter version, using \\d and applied to both example strings:

> x <- c('nsfghstighsl44.11.36.00-1vsdfgh', 'fa0044.11.36.00-1000')
> m <- gregexpr("(?<!\\d)\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}-\\d(?!\\d)", x, perl=TRUE)
> regmatches(x, m)
[[1]]
[1] "44.11.36.00-1"

[[2]]
character(0)
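
Because gregexpr works element by element, regmatches returns a list with one entry per input string. If a flat character vector is more convenient, the approach could be wrapped as follows. This is a sketch only: extract_codes is a hypothetical helper name introduced here, not a function from the original answer or from any package.

```r
# Hypothetical helper: returns a list with the matches found in each element of x.
extract_codes <- function(x) {
  pattern <- "(?<!\\d)\\d{2}\\.\\d{2}\\.\\d{2}\\.\\d{2}-\\d(?!\\d)"
  m <- gregexpr(pattern, x, perl = TRUE)  # perl = TRUE enables lookarounds
  regmatches(x, m)
}

x <- c("nsfghstighsl44.11.36.00-1vsdfgh", "fa0044.11.36.00-1000")
res <- extract_codes(x)

# Number of matches per input element.
n_found <- vapply(res, length, integer(1))
print(n_found)  # 1 0

# unlist() drops the empty entries and flattens the rest.
codes <- unlist(res)
print(codes)    # "44.11.36.00-1"
```

Keeping the list form preserves which input each match came from; unlist is only appropriate when that association is not needed.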
