do.call to build and execute data.table commands


Question

I have a small data.table with one record per test cell (A/B testing results), and I want to add several more columns that compare each test cell against every other test cell. In other words, the number of columns I want to add depends on how many test cells are in the A/B test in question.

My data.table looks like:

Group   Delta     SD.diff
Control     0           0
Cell1 0.00200 0.001096139
Cell2 0.00196 0.001095797
Cell3 0.00210 0.001096992
Cell4 0.00160 0.001092716

And I want to add the following columns (the numbers here are placeholders):

Group v.Cell1    v.Cell2   v.Cell3   v.Cell4
Control  0.45       0.41      0.45      0.41 
Cell1    0.50       0.58      0.48      0.66
Cell2    0.58       0.50      0.58      0.48
Cell3    0.48       0.58      0.50      0.70
Cell4    0.66       0.48      0.70      0.50

I am sure that do.call is the way to go, but I can't work out how to embed one do.call inside another to generate the script, and I can't work out how to then execute the scripts (20 lines in total). The closest I have got so far is:

a <- do.call("paste",c("test.1.results <- mutate(test.1.results, P.Better.",list(unlist(test.1.results[,Group]))," = pnorm(Delta, test.1.results['",list(unlist(test.1.results[,Group])),"'][,Delta], SD.diff,lower.tail=TRUE))", sep=""))

This produces 5 script lines like:

test.1.results <- mutate(test.1.results, P.Better.Cell2 = pnorm(Delta, test.1.results['Cell2'][,Delta], SD.diff,lower.tail=TRUE))

Each of these only compares one test cell's results against itself, giving a result of 0.50 (a difference due to chance). That is no use whatsoever, as I need each test cell compared against every other.

I'm not sure where to go from here.

Recommended answer

Update: In v1.8.11, FR #2077 is now implemented: set() can now add columns by reference. From NEWS:

set() is able to add new columns by reference now. For example, set(DT, i=3:5, j="bla", 5L) is equivalent to DT[3:5, bla := 5L]. This was FR #2077. Tests added.
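
The equivalence quoted from NEWS can be checked on a small table. This is a minimal sketch that assumes data.table v1.8.11 or later is installed; the tables DT1 and DT2 and the column id are made up for illustration and are not the asker's data.

```r
library(data.table)

# Hypothetical toy tables, not the data from the question
DT1 <- data.table(id = 1:6)
DT2 <- copy(DT1)

set(DT1, i = 3:5, j = "bla", value = 5L)  # add column "bla" by reference via set()
DT2[3:5, bla := 5L]                       # the same update written with :=

# Both forms should leave the same values in the new column
identical(DT1$bla, DT2$bla)
```

Neither call copies the table; the practical difference is that set() takes the column name as an ordinary character string, which is convenient when the name is built inside a loop.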


Tasks like this are often easier with set(). To demonstrate, here's a translation of what you have in the question (untested). But I realise you want something different from what you've posted (which I don't quite follow on a quick read).

for (cell in paste0("Cell", 1:4))
  set(DT,                             # the data.table to update/add column by reference
      i = NULL,                       # no row subset, NULL is default anyway
      j = paste0("P.Better.", cell),  # column name or position; must be a name when adding
      value = pnorm(DT$Delta, DT[Group == cell, Delta], DT$SD.diff, lower.tail = TRUE))

Note that you can add only a subset of a new column and the rest will be filled with NA. This holds for both := and set().
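
For a version that runs on its own, the loop can be applied to the table from the question. This is only a sketch: the data values are copied from the question, the name DT and the idea of deriving the cell names from the Group column are assumptions, and whether this pnorm() call is the comparison the asker actually needs is not settled in the thread.

```r
library(data.table)

# Data copied from the question
DT <- data.table(
  Group   = c("Control", "Cell1", "Cell2", "Cell3", "Cell4"),
  Delta   = c(0, 0.00200, 0.00196, 0.00210, 0.00160),
  SD.diff = c(0, 0.001096139, 0.001095797, 0.001096992, 0.001092716)
)

# Derive the cell names from the data so the number of new columns
# follows the number of test cells (an assumption about the intent)
cells <- setdiff(DT$Group, "Control")

for (g in cells) {
  ref <- DT[Group == g, Delta]  # Delta of the cell being compared against
  set(DT,
      j = paste0("P.Better.", g),
      value = pnorm(DT$Delta, mean = ref, sd = DT$SD.diff, lower.tail = TRUE))
}

print(DT)
```

Each pass adds one column by reference, so no script text has to be generated or evaluated. A cell compared against itself still gives 0.5, because the first two arguments of pnorm() are then equal.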
