Sum partial datatable as column for another datatable

Question

I have two data tables.

> dt = data.table(game=c(1, 2, 3, 1, 2, 3, 1, 2, 3),
+ player=c("ace", "ace", "ace", "bob", "bob", "bob", "casey", "casey", "casey"),
+ points=c(5, 2, 3, 2, 6, 7, 3, 4, 2))
> dt
   game player points
1:    1    ace      5
2:    2    ace      2
3:    3    ace      3
4:    1    bob      2
5:    2    bob      6
6:    3    bob      7
7:    1  casey      3
8:    2  casey      4
9:    3  casey      2

> out = data.table(start=c(1, 1, 3),
+ end=c(2, 2, 3),
+ player=c("ace", "bob", "casey"))
> out
   start end player
1:     1   2    ace
2:     1   2    bob
3:     3   3  casey
> ???
> ???
> out 
   start end player points
1:     1   2    ace      7
2:     1   2    bob      8
3:     3   3  casey      2

The non-R way would be to iterate over each row of out, filter dt for that player and for game numbers between start and end (inclusive, as the expected output shows), then sum the points column from dt and put it into a new column in out.

What is the best way to do this in R?

Recommended answer

You can use a non-equi join in data.table and then sum the points of the rows whose game falls within each start-to-end range.

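library(data.table)

# Non-equi join: for each row of out, match the dt rows for the same player
# whose game lies between start and end (inclusive), then sum the matched
# points per (start, end, player).
dt[out, .(start, end, game, player, points),
   on = .(player, game >= start, game <= end)][
   , .(points = sum(points)), by = .(start, end, player)]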

#   start end player points
#1:     1   2    ace      7
#2:     1   2    bob      8
#3:     3   3  casey      2
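
The snippet above prints a new table and leaves out unchanged. If the sums should end up as a column of out itself, as the question asks, one possible variation is to aggregate inside the join with by = .EACHI and assign the result by reference. This is a sketch rather than part of the original answer; the intermediate name totals is made up for illustration, and it assumes the result of the join keeps one row per row of out, in the same order, which is what by = .EACHI is documented to do.

```r
library(data.table)

# Same sample data as in the question.
dt <- data.table(
  game   = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  player = rep(c("ace", "bob", "casey"), each = 3),
  points = c(5, 2, 3, 2, 6, 7, 3, 4, 2)
)
out <- data.table(
  start  = c(1, 1, 3),
  end    = c(2, 2, 3),
  player = c("ace", "bob", "casey")
)

# For each row of out, sum the points of the dt rows that match it
# (same player, game between start and end inclusive).
# `totals` is a hypothetical helper name, not from the original answer.
totals <- dt[out,
             on = .(player, game >= start, game <= end),
             .(points = sum(points)),
             by = .EACHI]

# by = .EACHI returns one row per row of out, in out's order,
# so the sums can be attached by position.
out[, points := totals$points]
out
```

Because the aggregation runs once per row of out, repeated (start, end, player) combinations in out each keep their own row, and a row with no matching games gets NA rather than being dropped.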
