Regex group capture in R with multiple capture-groups
Question
In R, is it possible to extract group capture from a regular expression match? As far as I can tell, none of grep, grepl, regexpr, gregexpr, sub, or gsub return the group captures.
I need to extract key-value pairs from strings that are encoded thus:
\((.*?) :: (0\.[0-9]+)\)
I can always just do multiple full-match greps, or do some outside (non-R) processing, but I was hoping I could do it all within R. Is there a function, or a package that provides one, to do this?
Recommended answer
str_match(), from the stringr package, will do this. It returns a character matrix with one column for each group in the match (and one for the whole match):
> s = c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
> str_match(s, "\\((.*?) :: (0\\.[0-9]+)\\)")
     [,1]                         [,2]       [,3]
[1,] "(sometext :: 0.1231313213)" "sometext" "0.1231313213"
[2,] "(moretext :: 0.111222)"     "moretext" "0.111222"
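If adding a package is not an option, base R can probably get the same result by combining regexec() with regmatches(). The following is a minimal sketch rather than part of the original answer: the variable names (pattern, m, pairs, keys, values) are chosen here for illustration, and it assumes every input string matches the pattern, since a non-matching string would contribute an empty vector and break the row binding.

```r
# Same sample input and pattern as in the stringr example above.
s <- c("(sometext :: 0.1231313213)", "(moretext :: 0.111222)")
pattern <- "\\((.*?) :: (0\\.[0-9]+)\\)"

# regexec() records the position of the full match and of each capture group;
# regmatches() turns those positions into substrings, one character vector
# per input string: full match first, then each group in order.
m <- regmatches(s, regexec(pattern, s))

# Assumption: every string matched, so all vectors have length 3
# and can be stacked into a matrix with one row per input string.
pairs <- do.call(rbind, m)

keys <- pairs[, 2]                # first capture group
values <- as.numeric(pairs[, 3])  # second capture group, converted to numeric

print(pairs)
```

The resulting matrix has the same layout as the str_match() output: column 1 holds the whole match and columns 2 and 3 hold the two groups.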