R data.table: adding new column for subset of rows conditional on all rows


Problem description

Task: for all rows where condition == FALSE, set groupmean to the mean of all numbers in the row's group. For all rows where condition == TRUE, set groupmean to the mean of only those numbers in the group where condition == TRUE. I would like a solution that does not copy the whole data.table but adds the desired column in place. I bet there is a plain, simple solution, but I got a little lost...

My attempt so far:

set.seed(42)
require(data.table)

DT <- data.table(condition=sample(c(TRUE,FALSE), 50, replace=TRUE),
                 group=rep(LETTERS[1:4], times=25),
                 numbers=1:100)

# modifies the right rows, but wrong value
DT[condition==FALSE, groupmean_1 := mean(numbers), by=group]

# right values, but not only rows where condition=FALSE
DT[, groupmean_2 := mean(numbers), by=group]

head(DT)
     condition group numbers groupmean_1 groupmean_2
1:     FALSE     A       1    42.66667          49
2:     FALSE     B       2    55.68421          50
3:      TRUE     C       3          NA          51
4:     FALSE     D       4    47.78947          52
5:     FALSE     A       5    42.66667          49
6:     FALSE     B       6    55.68421          50


Recommended answer

Reverse the order in which you define groupmean. First compute it as the group mean over all rows, then overwrite the rows where condition == TRUE. Both steps use :=, which assigns by reference, so the data.table is not copied.

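# Step 1: mean of all numbers per group, assigned to every row
DT[, groupmean := mean(numbers), by = group]
# Step 2: overwrite only the condition == TRUE rows with the mean of those rows per group
DT[condition == TRUE, groupmean := mean(numbers), by = .(group, condition)]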
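
The two-step approach can be checked against a single-pass variant. The following is a minimal sketch, not part of the original answer: the column name groupmean_alt is made up for illustration, and the sample size is set to 100 here (an assumption, so that no recycling is needed), which means the values will not match the output printed in the question.

```r
library(data.table)

set.seed(42)
# Same shape as the question's data; condition is drawn for all 100 rows here
DT <- data.table(
  condition = sample(c(TRUE, FALSE), 100, replace = TRUE),
  group     = rep(LETTERS[1:4], times = 25),
  numbers   = 1:100
)

# Two-step approach from the answer
DT[, groupmean := mean(numbers), by = group]
DT[condition == TRUE, groupmean := mean(numbers), by = group]

# Single-pass variant (illustrative): within each group, rows with
# condition == TRUE get the mean of the TRUE rows, all others the full mean
DT[, groupmean_alt := ifelse(condition,
                             mean(numbers[condition]),
                             mean(numbers)),
   by = group]

# Both columns should agree row by row
stopifnot(isTRUE(all.equal(DT$groupmean, DT$groupmean_alt)))
```

Because i is evaluated before j, the second step only sees the condition == TRUE rows, which is why grouping by group alone is enough there. The single-pass variant trades that for one grouped call, at the cost of a slightly denser expression.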

I hope this helps.
