Collapse a matrix to sum values in one column by values in another
Problem description
I have a matrix with three columns: county, date, and number of ED visits. The dates repeat for each county, like this (just an example):
County A 1/1/2012 2
County A 1/2/2012 0
County A 1/3/2012 5
... etc.
County B 1/1/2012 3
County B 1/2/2012 4
... etc.
I would like to collapse this matrix to sum the visits from all counties for each date. So it would look like this:
1/1/2012 5
1/2/2012 4
etc.
I am trying to use the table() function in R but can't seem to get it to operate on visits by date in this manner. When I do table(dt$date, dt$Visits) it gives me a table of frequencies like this:
             0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  2011-01-01 3 1 2 0 1 1 0 2 0 0  0  0  0  0  0  0
  2011-01-02 2 3 1 0 0 1 0 0 1 0  2  0  0  0  0  0
  2011-01-03 3 1 1 2 1 0 0 0 0 1  0  0  0  0  1  0
Any suggestions? Is there a better function to use, perhaps a "sum" of some sort?
Thanks!
Recommended answer
As @DWin states, table() is not for summation, but for record counts.
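To make the difference concrete, here is a minimal base R sketch on toy data shaped like the example in the question. The data frame `dt`, its column names, and the choice of `tapply()` are assumptions for illustration, not part of the original answer.

```r
# Toy data mirroring the example rows in the question (illustrative only).
dt <- data.frame(
  county = c("County A", "County A", "County A", "County B", "County B"),
  date   = as.Date(c("2012-01-01", "2012-01-02", "2012-01-03",
                     "2012-01-01", "2012-01-02")),
  visits = c(2, 0, 5, 3, 4)
)

# table() cross-tabulates: each cell is the number of rows that share a
# given date and a given visits value. It never adds the visits together.
counts <- table(dt$date, dt$visits)

# tapply() splits visits by date and applies sum() to each group,
# giving one total per date.
totals <- tapply(dt$visits, dt$date, sum)

print(counts)
print(totals)
```

With this toy data, the cells of `counts` add up to the number of rows, while `totals` holds the per-date sums the question asks for.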
Here are examples using plyr, data.table, and aggregate:
# Simulated data: three counties, one row per county per day of 2012
all_data <- expand.grid(county = paste('County', LETTERS[1:3]),
                        date = seq(as.Date('2012/01/01'), as.Date('2012/12/31'), by = 1))
all_data[['ed_visits']] <- rpois(nrow(all_data), lambda = 5)
# using plyr
library(plyr)
by_date_plyr <- ddply(all_data, .(date), summarize, visits = sum(ed_visits))
# using data.table
library(data.table)
all_DT <- data.table(all_data)
by_date_dt <- all_DT[, list(visits = sum(ed_visits)), by = 'date' ]
# using aggregate
by_date_base <- aggregate(ed_visits ~ date, data = all_data, sum)
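Whichever approach is used, it can be worth checking that the collapse kept every visit. The following is a self-contained sketch using base R only. The fixed seed, the shortened ten-day date range, and the cross-check with `rowsum()` are assumptions added for illustration.

```r
set.seed(1)  # assumption: fixed seed so the simulated counts are repeatable

# Smaller version of the simulated data: three counties, ten days.
all_data <- expand.grid(
  county = paste("County", LETTERS[1:3]),
  date   = seq(as.Date("2012-01-01"), as.Date("2012-01-10"), by = 1)
)
all_data$ed_visits <- rpois(nrow(all_data), lambda = 5)

# Collapse to one row per date, summing visits across counties.
by_date_base <- aggregate(ed_visits ~ date, data = all_data, FUN = sum)

# Sanity checks: one row per distinct date, and the grand total is unchanged.
stopifnot(nrow(by_date_base) == length(unique(all_data$date)))
stopifnot(sum(by_date_base$ed_visits) == sum(all_data$ed_visits))

# rowsum() is another base R route; it returns a one-column matrix
# with the group labels as row names.
by_date_rowsum <- rowsum(all_data$ed_visits,
                         group = as.character(all_data$date))

print(head(by_date_base))
```

If either `stopifnot()` call fails, rows were dropped or double counted during the collapse, for example because of missing values in the grouping column.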