R: gsub and capture

Question

I am trying to extract the contents between square brackets from a string:

eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

I can filter them out:

gsub("\\[.+?\\]", "", eq)  # removes the square brackets and everything inside them
[1] "(5) h + nadh + q10 --> (4) h + nad + q10h2"

But how can I capture what's inside the brackets? I tried the following:

gsub("\\[(.+)?\\])", "\\1", eq) 
grep("\\[(.+)?\\]", eq, value=TRUE)

but both return the whole string:

[1] "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

Also, in my application I never know how many such bracketed terms will occur, so I wouldn't know what the 'replacement' argument in gsub should look like (e.g. \\1 or \\1_\\2). Thanks in advance!

Recommended answer

Try this:

eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"
pattern <- "\\[.+?\\]"
m <- gregexpr(pattern, eq)
regmatches(eq, m)
[[1]]
[1] "[m]" "[m]" "[m]" "[c]" "[m]" "[m]"
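
The matches above still include the brackets themselves. As a minimal sketch of one way to keep only the contents, the brackets can be stripped from the extracted vector afterwards. The names tokens and inner are hypothetical and used only for illustration:

```r
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

# Extract every bracketed term, brackets included (same approach as above)
tokens <- regmatches(eq, gregexpr("\\[.+?\\]", eq))[[1]]

# Remove the literal "[" and "]" from each extracted term
inner <- gsub("\\[|\\]", "", tokens)
inner
```

This works for any number of bracketed terms, because regmatches returns one element per match.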

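Your first pattern didn't work because it contains an extra closing parenthesis after \\], which never occurs in the string, so nothing matched. The ? also needs to sit inside the group, so that .+ becomes non-greedy: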

gsub("\\[(.+)?\\])", "\\1", eq) # Yours 
gsub("\\[(.+?)\\]", "\\1", eq) # Corrected -- kind of
[1] "(5) hm + nadhm + q10m --> (4) hc + nadm + q10h2m"

What you are essentially doing is replacing every match with its first captured group, which only strips the brackets and leaves the rest of the string in place. That isn't what you want.
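
Since the number of bracketed terms is not known in advance, one option is to collect all matches into a vector and join them afterwards, rather than writing backreferences such as \\1_\\2 by hand. The sketch below assumes Perl-style lookarounds (perl = TRUE) so that the brackets are never part of the match. The names contents and joined are hypothetical:

```r
eq <- "(5) h[m] + nadh[m] + q10[m] --> (4) h[c] + nad[m] + q10h2[m]"

# (?<=\[) requires a "[" just before the match, (?=\]) a "]" just after;
# [^]]+ matches one or more characters that are not "]"
m <- gregexpr("(?<=\\[)[^]]+(?=\\])", eq, perl = TRUE)
contents <- regmatches(eq, m)[[1]]

# Join however many terms were found with an underscore
joined <- paste(contents, collapse = "_")
joined
```

The same call handles one term or many, because paste with collapse works on a vector of any length.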

Your second pattern, using grep, simply searched the string for the pattern, found it, and returned every element of the input that contains a match. Here that is your one string.
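
To illustrate that grep selects whole elements rather than extracting the matched text, here is a small sketch on a hypothetical three-element vector x (not from the original question):

```r
# Hypothetical input: two elements contain a bracketed term, one does not
x <- c("h[m]", "nadh", "q10[c]")
pattern <- "\\[.+?\\]"

# value = TRUE returns the matching elements in full
hits <- grep(pattern, x, value = TRUE)

# Without value = TRUE, grep returns the indices of those elements
idx <- grep(pattern, x)

# grepl returns one logical per element
flags <- grepl(pattern, x)
```

Here hits contains "h[m]" and "q10[c]" unchanged, which is why a single input string that matches comes back whole.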
