R: gsub and capture
Question
I am trying to extract the contents between square brackets from a string:
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"
I can filter them out:
gsub("\\[.+?\\]","" ,eq) ##replaces square brackets and everything inside it
[1] "(5) h + nadh + q10 --> (4) h + nad + q10h2"
But how can I capture what's inside the brackets? I tried the following:
gsub("\\[(.+)?\\])", "\\1", eq)
grep("\\[(.+)?\\]", eq, value=TRUE)
but both return the whole string:
[1] "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"
Also, in my application I never know how many such terms in square brackets will occur, so I wouldn't know what the replacement argument in gsub should look like (e.g. \\1 or \\1_\\2).
Thanks in advance!
Recommended answer
Try this:
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"
pattern<-"\\[.+?\\]"
m <- gregexpr(pattern, eq)
regmatches(eq, m)
[[1]]
[1] "[m]" "[m]" "[m]" "[c]" "[m]" "[m]"
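The matches above still include the square brackets, while the question asks for the contents only. A minimal sketch of two ways to drop them, using only base R: either strip the brackets from each match afterwards, or match with lookarounds (this variant assumes perl = TRUE, since lookarounds are not available in R's default regex engine). The variable names `matches`, `inside`, and `inside_perl` are illustrative and not from the original answer.

```r
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

# Option 1: extract the bracketed terms, then strip the leading "[" and
# trailing "]" from each match.
matches <- regmatches(eq, gregexpr("\\[.+?\\]", eq))[[1]]
inside <- gsub("^\\[|\\]$", "", matches)
inside
# [1] "m" "m" "m" "c" "m" "m"

# Option 2: use lookbehind/lookahead so the brackets are required around
# the match but are not part of it (needs perl = TRUE).
inside_perl <- regmatches(eq, gregexpr("(?<=\\[)[^\\]]+(?=\\])", eq, perl = TRUE))[[1]]
inside_perl
# [1] "m" "m" "m" "c" "m" "m"
```

Both variants work for any number of bracketed terms, because gregexpr returns every match in the string.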
Your first pattern didn't work because of an extra closing parenthesis after \\], which never matches anything in the string:
gsub("\\[(.+)?\\])", "\\1", eq) # Yours
gsub("\\[(.+?)\\]", "\\1", eq) # Corrected -- kind of
[1] "(5) hm + nadhm + q10m --> (4) hc + nadm + q10h2m"
What you are essentially doing is replacing every match with its first captured group, which isn't what you want.
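To make the per-match behaviour visible, here is a small sketch. It wraps each captured group in angle brackets (an arbitrary marker chosen for illustration), which shows that \\1 is filled in again for every match, so a \\2 is never needed for this pattern. When the number of terms is unknown, one possible approach is to extract the terms first and then join them with paste; the `_` separator is taken from the \\1_\\2 example in the question.

```r
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

# gsub applies the replacement to each match separately, so \\1 refers
# to the captured group of the match currently being replaced.
marked <- gsub("\\[(.+?)\\]", "<\\1>", eq)
marked
# [1] "(5) h<m> + nadh<m> + q10<m> --> (4) h<c> + nad<m> + q10h2<m>"

# Extract all bracketed terms, drop the brackets, then join them.
terms <- gsub("\\[(.+?)\\]", "\\1", regmatches(eq, gregexpr("\\[.+?\\]", eq))[[1]])
joined <- paste(terms, collapse = "_")
joined
# [1] "m_m_m_c_m_m"
```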
Your second pattern, using grep, simply searched the string for the pattern, found it, and then returned all strings that contained the pattern, which was your one string.
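A short sketch of that behaviour on a two-element vector may help. The vector `x` is made up for illustration: grep selects whole elements that contain a match and never returns the matched substring itself.

```r
# Hypothetical input: one element with bracketed terms, one without.
x <- c("h[m] + nadh[m]", "no brackets here")

# value = TRUE returns the matching elements in full.
hits <- grep("\\[(.+)?\\]", x, value = TRUE)
hits
# [1] "h[m] + nadh[m]"

# Without value = TRUE, grep returns the indices of those elements.
idx <- grep("\\[(.+)?\\]", x)
idx
# [1] 1
```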