使用data.table计数和汇总/汇总列 [英] Use data.table to count and aggregate / summarize a column

查看:168
本文介绍了使用data.table计数和汇总/汇总列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想对 data.table 中的一列进行计数和汇总,所以找不到最有效的方法。这似乎与我想要的 R用data.table汇总多列

I want to count and aggregate(sum) a column in a data.table, and couldn't find the most efficient way to do this. This seems to be close to what I want R summarizing multiple columns with data.table.

我的数据

set.seed(321)
dat <- data.table(MNTH = c(rep(201501,4), rep(201502,3), rep(201503,5), rep(201504,4)), 
                  VAR = sample(c(0,1), 16, replace=T))

> dat
     MNTH VAR
 1: 201501   1
 2: 201501   1
 3: 201501   0
 4: 201501   0
 5: 201502   0
 6: 201502   0
 7: 201502   0
 8: 201503   0
 9: 201503   0
10: 201503   1
11: 201503   1
12: 201503   0
13: 201504   1
14: 201504   0
15: 201504   1
16: 201504   0

我想通过 MNTH VAR 进行计数和求和使用data.table。预期结果:

I want to both count and sum VAR by MNTH using data.table. The desired result:

    MNTH COUNT VAR
1 201501     4   2
2 201502     3   0
3 201503     5   2
4 201504     4   2


推荐答案

<您所指的帖子提供了一种关于如何将一种聚合方法应用于多个列的方法。如果要将不同的汇总方法应用于不同的列,则可以执行以下操作:

The post you are referring to gives a method on how to apply one aggregation method to several columns. If you want to apply different aggregation methods to different columns, you can do:

dat[, .(count = .N, var = sum(VAR)), by = MNTH]

这将导致:


     MNTH count var
1: 201501     4   2
2: 201502     3   0
3: 201503     5   2
4: 201504     4   2


您还可以通过以下方式通过更新数据集将这些值添加到现有数据集中:

You can also add these values to your existing dataset by updating your dataset by reference:

dat[, `:=` (count = .N, var = sum(VAR)), by = MNTH]


> dat
      MNTH VAR count var
 1: 201501   1     4   2
 2: 201501   1     4   2
 3: 201501   0     4   2
 4: 201501   0     4   2
 5: 201502   0     3   0
 6: 201502   0     3   0
 7: 201502   0     3   0
 8: 201503   0     5   2
 9: 201503   0     5   2
10: 201503   1     5   2
11: 201503   1     5   2
12: 201503   0     5   2
13: 201504   1     4   2
14: 201504   0     4   2
15: 201504   1     4   2
16: 201504   0     4   2


有关如何使用语法,请参见入门指南 在GitHub Wiki上。

For further reading about how to use data.table syntax, see the Getting started guides on the GitHub wiki.

这篇关于使用data.table计数和汇总/汇总列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆