Pass variables and names to a data.table function


Question

I have a report that needs to be applied to data.tables with different column names (for both j and by). The only way I have managed to do this is by wrapping the arguments in eval(substitute(value)), which makes the code less readable. I have named the j argument "variable", but I would like to pass the j argument of the function to the setnames function.

So the questions are:

Is there a way to avoid the eval(substitute(value)) construction?

Can I pass the j argument to the setnames function?

library(data.table)
library(ggplot2)
data(diamonds, package = "ggplot2")
dt = as.data.table(diamonds)

var.report = function(df, value, by.value) {
  var.report = df[, list( .N,
                    sum(is.finite(eval(substitute(value)))), # count values
                    sum(is.na(eval(substitute(value)))) # count NA
  ), by = eval(substitute(by.value))]

  setnames(var.report, c("variable", "N","n.val","n.NA"))

  return(var.report)
}


var.report(dt, depth, clarity)

Recommended answer

How about applying eval(substitute(...)) to the entire body of the function (or just to the data.table calculation, if you want to be more specific):

var.report = function(df, value, by.value) {
  eval(substitute({
    var.report = df[, list( .N,
                      sum(is.finite(value)), # count values
                      sum(is.na(value)) # count NA
    ), by = by.value]

    setnames(var.report, c("variable", "N","n.val","n.NA"))

    return(var.report)
  }))
}

var.report(dt, depth, clarity)
#   variable     N n.val n.NA
#1:      SI2  9194  9194    0
#2:      SI1 13065 13065    0
#3:      VS1  8171  8171    0
#4:      VS2 12258 12258    0
#5:     VVS2  5066  5066    0
#6:     VVS1  3655  3655    0
#7:       I1   741   741    0
#8:       IF  1790  1790    0
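
To see why wrapping the whole expression works, it can help to look at what substitute() builds before eval() runs it. The following is a minimal sketch using base R only; the helper name show.call is hypothetical and not part of the original answer. It returns the unevaluated call instead of evaluating it, so the symbols dt, depth and clarity do not need to exist.

```r
# Hypothetical helper: return the substituted call instead of evaluating it.
show.call = function(df, value, by.value) {
  substitute(df[, list(.N, sum(is.finite(value))), by = by.value])
}

# The arguments are captured as symbols; nothing is evaluated here.
e = show.call(dt, depth, clarity)
print(e)
# dt[, list(.N, sum(is.finite(depth))), by = clarity]
```

Each function argument is replaced by the expression the caller supplied, so eval() then runs an ordinary data.table call with the real column names in j and by.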

I don't really understand the second question. I would normally assign the names in the original expression, which helps keep track of things better, like so:

var.report = df[, list(N     = .N,
                       n.val = sum(is.finite(value)), # count values
                       n.NA  = sum(is.na(value)) # count NA
                      )
                , by = list(variable = by.value)]
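
Putting the named expression back into the function might look like the sketch below. It uses a small toy table (dt.toy, with made-up values) in place of diamonds so that it runs without ggplot2. The function name var.report.named is hypothetical. The final setnames() call reflects only one possible reading of the second question, namely that the grouping column should carry the name of the column passed as by.value; deparse(substitute(by.value)) is used to obtain that name as a string.

```r
library(data.table)

# Toy data standing in for the diamonds table (values are made up).
dt.toy = data.table(
  clarity = c("SI1", "SI1", "VS1", "VS1", "VS1"),
  depth   = c(61.5, NA, 62.4, Inf, 60.0)
)

# Hypothetical variant of var.report with names assigned in the expression.
var.report.named = function(df, value, by.value) {
  res = eval(substitute(
    df[, list(N     = .N,
              n.val = sum(is.finite(value)), # count finite values
              n.NA  = sum(is.na(value))      # count NA
             ),
       by = list(variable = by.value)]
  ))

  # Assumption: rename the grouping column after the by argument.
  # deparse(substitute(by.value)) yields the argument as a string, e.g. "clarity".
  setnames(res, "variable", deparse(substitute(by.value)))

  res
}

res = var.report.named(dt.toy, depth, clarity)
print(res)
```

If the fixed name "variable" is preferred, as in the original report, the setnames() line can simply be dropped.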
