Count of values within specified range of value in each row using data.table


Problem description

Adding a column of counts for each level (or combination of levels) of a categorical variable can be handled in data.table syntax with something like:

library(data.table)

# set up the data so it's pasteable
df <- data.table(var1 = c('dog','cat','dog','cat','dog','dog','dog'),
                 var2 = c(1,5,90,95,91,110,8),
                 var3 = c('lamp','lamp','lamp','table','table','table','table'))

#adding a count column for var1
df[, var1count := .N, by = .(var1)]

#adding a count of each combo of var1 and var3
df[, var1and3comb := .N, by = .(var1,var3)]
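
For reference, the same grouped count can be inspected as a summary table instead of being added as a column. This is a minimal sketch on the same data; the name `counts_by_var1` is only illustrative and does not come from the question.

```r
library(data.table)

df <- data.table(var1 = c('dog','cat','dog','cat','dog','dog','dog'),
                 var2 = c(1,5,90,95,91,110,8),
                 var3 = c('lamp','lamp','lamp','table','table','table','table'))

# Aggregating form: one row per level of var1, with its count in column N
counts_by_var1 <- df[, .N, by = var1]
counts_by_var1

# := form: each group's count is written back onto every row of that group
df[, var1count := .N, by = var1]
df$var1count
```

The aggregating form returns one row per group, while `:=` keeps all seven rows and repeats the group count on each of them.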

I am curious how I could instead produce a count column that, for each row, counts the number of records whose var2 is within +-5 of that row's var2.

In my non-functioning attempt at this,

df[, var2withinrange := .N, by = .(between((var2-5),(var2+5),var2))]

I get a column holding the total number of records rather than the desired result. I would expect the first row to hold a value of 2, since 1 and 5 fall within range of 1. Row 2 should hold a value of 3, since 1, 5, and 8 all fall within range of 5, and so on.
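
One way to see why the attempt returns the total row count is to evaluate its grouping expression on its own. The sketch below assumes the default arguments of data.table's `between(x, lower, upper)`; the names `grp` and `expected` are illustrative only. It also computes the desired counts by brute force, which gives a reference to compare any solution against.

```r
library(data.table)

var2 <- c(1, 5, 90, 95, 91, 110, 8)

# In the attempt, x = var2 - 5, lower = var2 + 5, upper = var2, so the test
# includes (var2 - 5 >= var2 + 5), which can never hold.
grp <- between(var2 - 5, var2 + 5, var2)
unique(grp)  # a single grouping value, so .N is the full row count

# Brute-force reference: for each value v, count the values in [v - 5, v + 5]
expected <- sapply(var2, function(v) sum(var2 >= v - 5 & var2 <= v + 5))
expected
```

Because every row lands in the same group, `by` has nothing to separate and `.N` is 7 for all rows.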

Any help with a solution is much appreciated. Ideally in data.table code!

Recommended answer

With a non-equi self-join:

df[, var2withinrange := df[.(var2min = var2 - 5, var2plus = var2 + 5)
                           , on = .(var2 >= var2min, var2 <= var2plus)
                           , .N
                           , by = .EACHI][, N]][]

gives:


> df
   var1 var2  var3 var2withinrange
1:  dog    1  lamp               2
2:  cat    5  lamp               3
3:  dog   90  lamp               3
4:  cat   95 table               3
5:  dog   91 table               3
6:  dog  110 table               1
7:  dog    8 table               2
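
The one-liner can also be read in separate steps. The following sketch splits it into an explicit table of bounds and a join; the names `bounds`, `lo`, `hi`, and `counts` are hypothetical and chosen here only for readability.

```r
library(data.table)

df <- data.table(var1 = c('dog','cat','dog','cat','dog','dog','dog'),
                 var2 = c(1,5,90,95,91,110,8),
                 var3 = c('lamp','lamp','lamp','table','table','table','table'))

# Step 1: one row of bounds per row of df
bounds <- df[, .(lo = var2 - 5, hi = var2 + 5)]

# Step 2: non-equi join; by = .EACHI evaluates .N once per row of bounds,
# in the order of bounds, so the result lines up with the rows of df
counts <- df[bounds, on = .(var2 >= lo, var2 <= hi), .N, by = .EACHI]

# Step 3: attach the counts as a new column
df[, var2withinrange := counts$N]
df$var2withinrange
```

Every row matches at least itself, so each count is at least 1, and the order of `counts` follows `bounds`, which is why the column can be assigned directly.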

