Create count per item by year/decade


Question

I have data in a data.table that looks like this:

> x <- df[sample(nrow(df), 10), ]
> x
                    Importer                 Exporter       Date
 1:                 Ecuador                  United Kingdom 2004-01-13
 2:                  Mexico                   United States 2013-11-19
 3:               Australia                   United States 2006-08-11
 4:           United States                   United States 2009-05-04
 5:                   India                   United States 2007-07-16
 6:               Guatemala                       Guatemala 2014-07-02
 7:                  Israel                          Israel 2000-02-22
 8:                   India                   United States 2014-02-11
 9:                    Peru                            Peru 2007-03-26
10:                  Poland                          France 2014-09-15

I am trying to create summaries so that, for a given time period (say a decade), I can find the number of times each country appears as Importer and as Exporter. For the example above, the desired output when grouping by decade would be something like:

Decade    Country.Name    Importer.Count         Exporter.Count

2000      Ecuador         1                      0
2000      Mexico          1                      1
2000      Australia       1                      0
2000      United States   1                      3
.
.
.
2010     United States    0                      2
.
.
.

So far, I have tried the aggregate and data.table methods suggested in this post (http://stackoverflow.com/questions/14641874/summary-of-data-for-each-year-in-r), but both seem to give me only the total number of Importers/Exporters per year (or per decade, which is what I am more interested in), not a count per country.

> x$Decade<-year(x$Date)-year(x$Date)%%10
> importer_per_yr<-aggregate(Importer ~ Decade, FUN=length, data=x)
> importer_per_yr

   Decade                      Importer

2   2000                       6
3   2010                       4

Since aggregate uses the formula interface, I tried adding another grouping term, but got the following error:

> importer_per_yr<-aggregate(Importer~ Decade + unique(Importer), FUN=length, data=x)
Error in model.frame.default(formula = Importer ~ Decade +  : 
  variable lengths differ (found for 'unique(Importer)')

Is there a way to create the summary by decade and by importer/exporter? It does not matter if the summaries for importer and exporter end up in different tables.

Recommended answer

We can do this with data.table methods: create the 'Decade' column by assignment (:=), melt the data from 'wide' to 'long' format by specifying the measure columns, then reshape it back to 'wide' with dcast, using length as the fun.aggregate.

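library(data.table)
# Floor each year to the first year of its decade, e.g. 2004 -> 2000
x[, Decade := year(Date) - year(Date) %% 10]
# Stack Importer/Exporter into one Country column, then count rows per role
dcast(melt(x, measure.vars = c("Importer", "Exporter"), value.name = "Country"),
      Decade + Country ~ variable, fun.aggregate = length)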
#     Decade        Country Importer Exporter
# 1:   2000      Australia        1        0
# 2:   2000        Ecuador        1        0
# 3:   2000          India        1        0
# 4:   2000         Israel        1        1
# 5:   2000           Peru        1        1
# 6:   2000 United Kingdom        0        1
# 7:   2000  United States        1        3
# 8:   2010         France        0        1
# 9:   2010      Guatemala        1        1
#10:   2010          India        1        0
#11:   2010         Mexico        1        0
#12:   2010         Poland        1        0
#13:   2010  United States        0        2
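
Since the question also allows separate tables for importers and exporters, here is a minimal, self-contained sketch of both variants. It assumes the ten sampled rows shown in the question (retyped by hand here) and that Date is a Date column; the names long, counts, importer_counts and exporter_counts are illustrative and do not come from the original post.

```r
library(data.table)

# Rebuild the ten sampled rows from the question (assumption: typed in by hand)
x <- data.table(
  Importer = c("Ecuador", "Mexico", "Australia", "United States", "India",
               "Guatemala", "Israel", "India", "Peru", "Poland"),
  Exporter = c("United Kingdom", "United States", "United States",
               "United States", "United States", "Guatemala", "Israel",
               "United States", "Peru", "France"),
  Date = as.Date(c("2004-01-13", "2013-11-19", "2006-08-11", "2009-05-04",
                   "2007-07-16", "2014-07-02", "2000-02-22", "2014-02-11",
                   "2007-03-26", "2014-09-15"))
)

# Floor each year to the first year of its decade, e.g. 2004 -> 2000
x[, Decade := year(Date) - year(Date) %% 10]

# Intermediate 'long' table: one row per (record, role) pair
long <- melt(x, id.vars = "Decade", measure.vars = c("Importer", "Exporter"),
             variable.name = "Role", value.name = "Country")

# Combined table: one row per decade and country, one count column per role
counts <- dcast(long, Decade + Country ~ Role, fun.aggregate = length)

# Alternative: one table per role, counting rows per group with .N
importer_counts <- x[, .(Importer.Count = .N), keyby = .(Decade, Country = Importer)]
exporter_counts <- x[, .(Exporter.Count = .N), keyby = .(Decade, Country = Exporter)]
```

One difference to keep in mind: the per-role tables only contain countries that actually appear in that role within a decade, whereas the dcast result fills the missing combinations with 0.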
