How to cast multiple columns and values of a data.table?
Problem description
My data is structured as follows:

    DT <- data.table(Id = c(1, 1, 1, 1, 10, 100, 100, 101, 101, 101),
                     Date = as.Date(c("1997-01-01", "1997-01-02", "1997-01-03", "1997-01-04",
                                      "1997-01-02", "1997-01-02", "1997-01-04", "1997-01-03",
                                      "1997-01-04", "1997-01-04")),
                     group = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
                     Price.1 = c(29, 25, 14, 26, 30, 16, 13, 62, 12, 6),
                     Price.2 = c(4, 5, 6, 6, 8, 2, 3, 5, 7, 8))

    > DT
         Id       Date group Price.1 Price.2
     1:   1 1997-01-01     1      29       4
     2:   1 1997-01-02     1      25       5
     3:   1 1997-01-03     1      14       6
     4:   1 1997-01-04     1      26       6
     5:  10 1997-01-02     1      30       8
     6: 100 1997-01-02     2      16       2
     7: 100 1997-01-04     2      13       3
     8: 101 1997-01-03     2      62       5
     9: 101 1997-01-04     2      12       7
    10: 101 1997-01-04     2       6       8
I am trying to cast it (using dcast.data.table):
    dcast.data.table(DT, Id ~ Date, fun = sum, value.var = "Price.1")
    dcast.data.table(DT, Id ~ group, fun = sum, value.var = "Price.1")
    dcast.data.table(DT, Id ~ Date, fun = sum, value.var = "Price.2")
    dcast.data.table(DT, Id ~ group, fun = sum, value.var = "Price.2")
but rather than 4 separate outputs I am trying to get the following:
        Id 1997-01-01 1997-01-02 1997-01-03 1997-01-04  1  2   Price
    1:   1         29         25         14         26 94  0 Price.1
    2:  10          0         30          0          0 30  0 Price.1
    3: 100          0         16          0         13  0 29 Price.1
    4: 101          0          0         62         18  0 80 Price.1
    5:   1          4          5          6          6 21  0 Price.2
    6:  10          0          8          0          0  8  0 Price.2
    7: 100          0          2          0          3  0  5 Price.2
    8: 101          0          0          5         15  0 20 Price.2
and my work-around uses rbind, cbind, and merge.
    cbind(
      rbind(
        merge(dcast.data.table(DT, Id ~ Date, fun = sum, value.var = "Price.1"),
              dcast.data.table(DT, Id ~ group, fun = sum, value.var = "Price.1"),
              by = "Id", all.x = T),
        merge(dcast.data.table(DT, Id ~ Date, fun = sum, value.var = "Price.2"),
              dcast.data.table(DT, Id ~ group, fun = sum, value.var = "Price.2"),
              by = "Id", all.x = T)
      ),
      Price = c("Price.1", "Price.1", "Price.1", "Price.1",
                "Price.2", "Price.2", "Price.2", "Price.2")
    )
Is there an existing and cleaner way to do this?
Solution

I make the assumption that each Id maps to a unique group and get rid of that variable, but otherwise this is essentially the same as @user227710's answer.

    Idg <- unique(DT[, .(Id, group)])
    DT[, group := NULL]

    res <- dcast(
      melt(DT, id.vars = c("Id", "Date")),
      variable + Id ~ Date,
      value.var = "value",
      fill = 0,
      margins = "Date",
      fun.aggregate = sum
    )

    # and if you want the group back...
    setDT(res)  # needed before data.table 1.9.5, where using dcast.data.table is another option
    setkey(res, Id)
    res[Idg][order(variable, Id)]
which gives
       variable  Id 1997-01-01 1997-01-02 1997-01-03 1997-01-04 (all) group
    1:  Price.1   1         29         25         14         26    94     1
    2:  Price.2   1          4          5          6          6    21     1
    3:  Price.1  10          0         30          0          0    30     1
    4:  Price.2  10          0          8          0          0     8     1
    5:  Price.1 100          0         16          0         13    29     2
    6:  Price.2 100          0          2          0          3     5     2
    7:  Price.1 101          0          0         62         18    80     2
    8:  Price.2 101          0          0          5         15    20     2
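As a variation on the approach above, here is a minimal sketch that keeps the per-group total columns (1 and 2) from the layout requested in the question. It is not the answer author's code. It assumes a recent data.table release in which melt() and dcast() dispatch on data.table objects, and the names long, by_date, by_group and out are illustrative only. The idea is to melt once, cast once per spreading variable, and join the two wide tables, so two casts replace the four in the original work-around.

```r
library(data.table)

# Sample data from the question.
DT <- data.table(
  Id      = c(1, 1, 1, 1, 10, 100, 100, 101, 101, 101),
  Date    = as.Date(c("1997-01-01", "1997-01-02", "1997-01-03", "1997-01-04",
                      "1997-01-02", "1997-01-02", "1997-01-04", "1997-01-03",
                      "1997-01-04", "1997-01-04")),
  group   = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  Price.1 = c(29, 25, 14, 26, 30, 16, 13, 62, 12, 6),
  Price.2 = c(4, 5, 6, 6, 8, 2, 3, 5, 7, 8)
)

# Stack Price.1 and Price.2 into a single value column, labelled by Price.
long <- melt(DT, id.vars = c("Id", "Date", "group"),
             variable.name = "Price", value.name = "value")

# One cast per spreading variable; duplicate cells are summed, gaps become 0.
by_date  <- dcast(long, Price + Id ~ Date,  value.var = "value",
                  fun.aggregate = sum, fill = 0)
by_group <- dcast(long, Price + Id ~ group, value.var = "value",
                  fun.aggregate = sum, fill = 0)

# Join the two wide tables on their shared row identifiers.
out <- merge(by_date, by_group, by = c("Price", "Id"))
out
```

Because group is cast rather than dropped, this version does not rely on each Id belonging to a single group. The Price column comes first here rather than last; setcolorder() can move it if the exact column order matters.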