R: Define ranges from text using regex

Question

I need a way to refer to defined variables based on a string within a text. Say I have five variables (r010, r020, r030, r040, r050). Given text of the form "r010-050", I want the sum of the values of all five variables.

The whole text would look like "{r010-050} == {r060}". The first part of that equation needs to be replaced by the sum of the five variables, and since r060 is also a variable, the result of parsing and evaluating the text should be a logical value.

I think regex will help here again. Can anyone help? Thanks.

Recommended answer

Define the inputs: the variables r010 etc., which we assume are scalars, and the string s.

Then define a pattern pat that matches the {...} part, and a function Sum that accepts the 3 capture groups in pat (i.e. the strings matched by the parenthesized parts of pat) and performs the desired sum.
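
To see what the three capture groups contain for each kind of token, the pattern can be tried on its own with base R's regexec and regmatches. This is only an illustrative sketch; the helper name groups is hypothetical and not part of the answer.

```r
pat <- "[{](\\w+)(-(\\w+))?[}]"

# Hypothetical helper: return the full match followed by the three capture groups
groups <- function(x) regmatches(x, regexec(pat, x))[[1]]

groups("{r010-050}")
# [1] "{r010-050}" "r010"       "-050"       "050"

groups("{r060}")
# [1] "{r060}" "r060"   ""       ""
```

For a single name such as {r060}, the optional second and third groups come back as empty strings, which is why Sum has to handle an empty third argument.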

Use gsubfn to match the pattern, passing the capture groups to Sum and replacing each match with the output of Sum. Then parse and evaluate the resulting string.

In the example, the only variables in the global environment whose names lie between r010 and r050 inclusive are r010 and r020 (more would have been used had they existed), and since they sum to r060 the expression returns TRUE.

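library(gsubfn)

# inputs: scalar variables and the rule to evaluate
r010 <- 1; r020 <- 2; r060 <- 3
s <- "{r010-050} == {r060}"

# group 1: first name; group 2: optional "-suffix"; group 3: the suffix alone
pat <- "[{](\\w+)(-(\\w+))?[}]"

# x1, x2, x3 receive the three capture groups; x2 is not used
Sum <- function(x1, x2, x3, env = .GlobalEnv) {
  # upper bound: x1 itself for a single name, else x1's non-digit prefix plus x3
  x3 <- if (x3 == "") x1 else paste0(gsub("\\d", "", x1), x3)
  lst <- ls(env)
  # sum every variable in env whose name sorts between x1 and x3 inclusive
  sum(unlist(mget(lst[lst >= x1 & lst <= x3], envir = env)))
}

# replace each {...} with its sum, then parse and evaluate the result
eval(parse(text = gsubfn(pat, Sum, s)))
## [1] TRUE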
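
Because Sum selects variables by comparing names as strings, any other object in the environment whose name happens to sort between the two endpoints (say, a hypothetical r03x) would be summed as well. The following base R sketch shows one possible way to narrow the selection. It assumes that every name is a letter prefix followed by digits and that the variables are kept in a dedicated environment; the names vars, sum_token and eval_rule are illustrative and do not come from the answer.

```r
# Assumption: the variables live in their own environment, not the global one
vars <- list2env(list(r010 = 1, r020 = 2, r060 = 3))

pat <- "[{](\\w+)(-(\\w+))?[}]"

# Sum the variables named by one token such as "{r010-050}" or "{r060}"
sum_token <- function(token, env) {
  g <- regmatches(token, regexec(pat, token))[[1]]
  prefix <- sub("\\d+$", "", g[2])                 # e.g. "r"
  lo <- as.integer(sub("^\\D*", "", g[2]))         # e.g. 10
  hi <- if (g[4] == "") lo else as.integer(g[4])   # e.g. 50
  nms <- ls(env)
  # keep only names of the form <prefix><digits>
  nms <- nms[grepl(paste0("^", prefix, "\\d+$"), nms)]
  num <- as.integer(sub("^\\D*", "", nms))
  # compare the numeric parts instead of the names as strings
  sum(unlist(mget(nms[num >= lo & num <= hi], envir = env)))
}

# Replace every {...} token by its sum, then parse and evaluate the text
eval_rule <- function(s, env) {
  m <- gregexpr(pat, s)
  tokens <- regmatches(s, m)[[1]]
  sums <- vapply(tokens, sum_token, numeric(1), env = env)
  regmatches(s, m) <- list(as.character(sums))
  eval(parse(text = s))
}

eval_rule("{r010-050} == {r060}", vars)
```

With the inputs above the text becomes "3 == 3" before evaluation. As in the original approach, the sums are pasted back into the text as strings, so this is suited to simple scalar values.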
