Aggregating across list of dataframes and storing all results


Problem description

I have a list of 9 data frames, each with approximately 100 rows and 5-6 columns.

I want to aggregate the values in one column based on the groups specified in another column, across all data frames, and store all results in a separate data frame. To illustrate, consider a list:

    [[1]]  
    Date  Group  Age
    Nov     A    13
    Nov     A    14
    Nov     B    9
    Nov     D    10
    [[2]]
    Date  Group  Age
    Dec     C    11
    Dec     C    12
    Dec     E    10

My code is as follows:

    for (i in 1:length(list)){
    x<-aggregate(list[[i]]$Age~list[[i]]$Group, list[[i]], sum)
    x<-rbind(x)
    }

But in the end, x contains only the aggregate result from data frame 2 (since i = 2) and not that of data frame 1, although I am trying to bind the results.

Any help is much appreciated.

Solution

In R, there are many efficiently implemented functions which help to avoid the hassle of writing for loops.

In his comment, S Rivero suggested using lapply() instead of a for loop and then combining the aggregates with rbind():

    do.call(rbind, lapply(dflist, function(x) aggregate(Age ~ Group, x, sum)))
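For readers who prefer to keep an explicit loop, the following is a minimal sketch of how the loop from the question could be repaired. The original loop overwrites x on every iteration, and rbind(x) with a single argument simply returns x, so only the last aggregate survives. The name results is an illustrative assumption, and dflist is rebuilt here with data.frame() so the example is self-contained.

```r
# Example data, equivalent to the dflist given in the "Data" section
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"), Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"), Age = c(11L, 12L, 10L))
)

# Pre-allocate one slot per data frame ("results" is a hypothetical name)
results <- vector("list", length(dflist))

for (i in seq_along(dflist)) {
  # Store each aggregate in its own slot instead of overwriting x
  results[[i]] <- aggregate(Age ~ Group, data = dflist[[i]], FUN = sum)
}

# Bind all per-data-frame aggregates into a single data frame
x <- do.call(rbind, results)
print(x)
```

Note that this approach, like the lapply() version, aggregates within each data frame separately. If the same group appeared in more than one data frame, the result would contain one row per data frame for that group rather than a single overall sum.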

My suggestion is to combine the data.frames first and then compute the aggregates using data.table:

    library(data.table)
    rbindlist(dflist)[, sum(Age), by = Group]

       Group V1
    1:     A 27
    2:     B  9
    3:     D 10
    4:     C 23
    5:     E 10
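If data.table is not available, the same combine-first idea can probably be expressed in base R. This is a sketch under that assumption, not part of the original answer; the names combined and totals are illustrative, and dflist is rebuilt here so the example runs on its own.

```r
# Example data, equivalent to the dflist given in the "Data" section
dflist <- list(
  data.frame(Date = "Nov", Group = c("A", "A", "B", "D"), Age = c(13L, 14L, 9L, 10L)),
  data.frame(Date = "Dec", Group = c("C", "C", "E"), Age = c(11L, 12L, 10L))
)

# Stack all data frames first (they share the same columns) ...
combined <- do.call(rbind, dflist)

# ... then aggregate once over the combined data
totals <- aggregate(Age ~ Group, data = combined, FUN = sum)
print(totals)
```

Because the data are combined before aggregating, a group that occurs in several data frames is summed into a single row. Unlike the data.table output, which lists groups in order of first appearance, aggregate() returns the groups in sorted order.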

Data

    dflist <- list(structure(list(Date = c("Nov", "Nov", "Nov", "Nov"), Group = c("A", 
    "A", "B", "D"), Age = c(13L, 14L, 9L, 10L)), .Names = c("Date", 
    "Group", "Age"), row.names = c(NA, -4L), class = "data.frame"), 
        structure(list(Date = c("Dec", "Dec", "Dec"), Group = c("C", 
        "C", "E"), Age = c(11L, 12L, 10L)), .Names = c("Date", "Group", 
        "Age"), row.names = c(NA, -3L), class = "data.frame"))
