Regex group capture in R with multiple capture-groups


Question

In R, is it possible to extract the captured groups from a regular expression match? As far as I can tell, none of grep, grepl, regexpr, gregexpr, sub, or gsub return the group captures.

I need to extract key-value pairs from strings that are encoded like this:

\((.*?) :: (0\.[0-9]+)\)

I can always do multiple full-match greps, or do some outside (non-R) processing, but I was hoping to do it all within R. Is there a function, or a package that provides one, to do this?

Recommended answer

str_match(), from the stringr package, will do this. It returns a character matrix with one column for each group in the match (and one for the whole match):

> library(stringr)
> s = c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
> str_match(s, "\\((.*?) :: (0\\.[0-9]+)\\)")
     [,1]                         [,2]       [,3]          
[1,] "(sometext :: 0.1231313213)" "sometext" "0.1231313213"
[2,] "(moretext :: 0.111222)"     "moretext" "0.111222"    
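
If adding a package is not an option, base R can return the same groups by pairing regexec() with regmatches(). The following is only a sketch under a few assumptions: every input string matches the pattern (a non-matching string yields an empty element, which rbind would silently skip), and the names matches, m, and kv are illustrative rather than part of the original answer.

# Base R sketch: regexec() records the positions of the full match and of
# each capture group; regmatches() turns those positions into substrings.
s <- c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
pattern <- "\\((.*?) :: (0\\.[0-9]+)\\)"

# One list element per input string: c(full match, group 1, group 2).
matches <- regmatches(s, regexec(pattern, s))

# Bind the per-string vectors into a character matrix shaped like the
# str_match() result (assumes every string matched).
m <- do.call(rbind, matches)

# Illustrative follow-up: a named numeric vector of key-value pairs.
kv <- setNames(as.numeric(m[, 3]), m[, 2])
print(kv)

Keeping the result as a matrix mirrors what str_match() returns, so code written against either approach can index the groups by column in the same way.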
