How to sumif across two tables?
Question
I have two tables that I need to do a sumif across. Table 1 contains time periods, i.e. the year and the quarter at each year end (4, 8, 12 etc.). Table 2 contains the transactions that occur during the year, at quarters 3, 6, 7 etc.
I need Table 3 to sum all the transactions up to each year end, so that I get the cumulative position at year end.
Here's some sample code to show what the data looks like and what the output should look like:
library(data.table)
x1 <- data.table("Name" = "LOB1", "Year" = 2000,
                 "Quarter" = c(4, 8, 12, 16, 20, 24, 28, 32, 36))
x2 <- data.table("Name" = "LOB1", "Year" = 2000,
                 "Quarter" = c(3, 6, 7, 9, 11, 14, 16, 20, 24),
                 "Amount" = c(10000, 15000, -2500, 3500, -6500, 25000,
                              11000, 9000, 7500))
x3 <- data.table("Name" = "LOB1", "Year" = 2000,
                 "Quarter" = c(4, 8, 12, 16, 20, 24, 28, 32, 36),
                 "Amount" = c(10000, 22500, 19500, 55500, 64500, 72000,
                              72000, 72000, 72000))
I've tried merge, summarise and foverlaps but can't quite figure it out.
Recommended answer
Nice question. Basically, what you are trying to do is join by Name, Year and Quarter <= Quarter, while summing all the matched Amount values. This is possible both with the new non-equi joins (available in data.table v1.10.0, the latest stable version at the time of writing) and with foverlaps (though the latter will probably be sub-optimal).
Non-equi joins:
x2[x1, # for each value in `x1` find all the matching values in `x2`
.(Amount = sum(Amount)), # Sum all the matching values in `Amount`
on = .(Name, Year, Quarter <= Quarter), # join conditions
by = .EACHI] # Do the summing per each match in `i`
# Name Year Quarter Amount
# 1: LOB1 2000 4 10000
# 2: LOB1 2000 8 22500
# 3: LOB1 2000 12 19500
# 4: LOB1 2000 16 55500
# 5: LOB1 2000 20 64500
# 6: LOB1 2000 24 72000
# 7: LOB1 2000 28 72000
# 8: LOB1 2000 32 72000
# 9: LOB1 2000 36 72000
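To make the semantics of the join explicit, here is a minimal sketch that runs the same non-equi join and compares it against a row-by-row "sumif" written out by hand. The `manual` vector and the `sapply` loop are illustrative only and are not part of the original answer; the loop is far slower than the join on real data.

```r
library(data.table)

x1 <- data.table(Name = "LOB1", Year = 2000, Quarter = seq(4, 36, by = 4))
x2 <- data.table(Name = "LOB1", Year = 2000,
                 Quarter = c(3, 6, 7, 9, 11, 14, 16, 20, 24),
                 Amount  = c(10000, 15000, -2500, 3500, -6500, 25000,
                             11000, 9000, 7500))

# Non-equi join, as in the answer: one summed row per row of x1
res <- x2[x1, .(Amount = sum(Amount)),
          on = .(Name, Year, Quarter <= Quarter), by = .EACHI]

# The same condition spelled out per row of x1 (for comparison only)
manual <- sapply(seq_len(nrow(x1)), function(i) {
  sum(x2$Amount[x2$Name == x1$Name[i] &
                x2$Year == x1$Year[i] &
                x2$Quarter <= x1$Quarter[i]])
})

# Both approaches should agree
stopifnot(isTRUE(all.equal(res$Amount, manual)))
```

The `by = .EACHI` argument is what turns the join into a sumif: `j` is evaluated once for each row of `x1`, over just the `x2` rows that satisfy the `on` conditions for that row.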
As a side note, you can easily add Amount in place in x1 (proposed by @Frank):
x1[, Amount :=
x2[x1, sum(x.Amount), on = .(Name, Year, Quarter <= Quarter), by = .EACHI]$V1
]
This might be convenient if you have more than just the three join columns in that table.
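As a rough illustration of that point, the sketch below adds one extra column to `x1` and shows that the in-place assignment leaves it untouched. The `Segment` column and its value are hypothetical and were invented for this example; everything else follows the data in the question.

```r
library(data.table)

# `Segment` is a hypothetical extra column, not part of the original data
x1 <- data.table(Name = "LOB1", Year = 2000,
                 Quarter = seq(4, 36, by = 4),
                 Segment = "Retail")
x2 <- data.table(Name = "LOB1", Year = 2000,
                 Quarter = c(3, 6, 7, 9, 11, 14, 16, 20, 24),
                 Amount  = c(10000, 15000, -2500, 3500, -6500, 25000,
                             11000, 9000, 7500))

# Add the cumulative Amount to x1 by reference; the join returns one
# row per row of x1, so $V1 lines up with x1 row for row
x1[, Amount := x2[x1, sum(x.Amount),
                  on = .(Name, Year, Quarter <= Quarter),
                  by = .EACHI]$V1]

print(x1)
```

With the plain join, the result contains only the join columns plus the aggregate, so `Segment` would have to be carried along separately; updating `x1` by reference avoids that.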
foverlaps:
You mentioned foverlaps, so in theory you could achieve the same result with that function too, though I'm afraid you would easily run out of memory. With foverlaps, you need to create a huge table in which each value in x2 is joined multiple times to each matching value in x1, and store everything in memory:
x1[, Start := 0] # Make sure that we always join starting from Q0
x2[, Start := Quarter] # In x2 we want to join all possible rows each time
setkey(x2, Name, Year, Start, Quarter) # set keys
## Make a huge cartesian join by overlaps and then aggregate
foverlaps(x1, x2)[, .(Amount = sum(Amount)), by = .(Name, Year, Quarter = i.Quarter)]
# Name Year Quarter Amount
# 1: LOB1 2000 4 10000
# 2: LOB1 2000 8 22500
# 3: LOB1 2000 12 19500
# 4: LOB1 2000 16 55500
# 5: LOB1 2000 20 64500
# 6: LOB1 2000 24 72000
# 7: LOB1 2000 28 72000
# 8: LOB1 2000 32 72000
# 9: LOB1 2000 36 72000
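To make the memory concern concrete, this small sketch keeps the intermediate table produced by foverlaps and compares its row count with the size of the final result. The variable name `ov` is introduced here for illustration; the setup otherwise mirrors the code above.

```r
library(data.table)

x1 <- data.table(Name = "LOB1", Year = 2000, Quarter = seq(4, 36, by = 4))
x2 <- data.table(Name = "LOB1", Year = 2000,
                 Quarter = c(3, 6, 7, 9, 11, 14, 16, 20, 24),
                 Amount  = c(10000, 15000, -2500, 3500, -6500, 25000,
                             11000, 9000, 7500))

x1[, Start := 0]                        # each period covers [0, Quarter]
x2[, Start := Quarter]                  # each transaction is a point interval
setkey(x2, Name, Year, Start, Quarter)

# Intermediate table: one row per (period, matching transaction) pair
ov <- foverlaps(x1, x2)

# Aggregate it down to one row per period
res <- ov[, .(Amount = sum(Amount)), by = .(Name, Year, Quarter = i.Quarter)]

nrow(ov)   # rows materialised before aggregation
nrow(res)  # rows in the final result
```

The intermediate table grows with the number of matching pairs rather than with the number of periods, which is why the non-equi join with `by = .EACHI`, which aggregates per row of `x1` without first materialising the full set of pairs, is the safer choice on large data.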