R colSums By Group


Question

In the following matrix dataset:

       1  2   3   4   5  
1950   7 20  21  15  61  
1951   2 10   6  26  57  
1952  12 27  43  37  34  
1953  14 16  40  47  94  
1954   2 17  62 113 101  
1955   3  4  43  99 148  
1956   2 47  31  85  79  
1957  17  5  38 216 228  
1958  11 20  15  76  68  
1959  16 20  43  30 226  
1960   9 28  28  70 201  
1961   1 31 124  74 137  
1962  12 25  37  41 200  

I have been trying to calculate colSums by decade, i.e., find the sum of each column for 1950-1959, then for 1960-1969, and so on.

I tried tapply, ddply, etc., but couldn't find anything that actually works.

Recommended answer

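First we set up the matrix used as input. Because the header line has one fewer field than the data lines, read.table uses the first column (the years) as row names.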

Lines <- "1  2   3   4   5  
1950   7 20  21  15  61  
1951   2 10   6  26  57  
1952  12 27  43  37  34  
1953  14 16  40  47  94  
1954   2 17  62 113 101  
1955   3  4  43  99 148  
1956   2 47  31  85  79  
1957  17  5  38 216 228  
1958  11 20  15  76  68  
1959  16 20  43  30 226  
1960   9 28  28  70 201  
1961   1 31 124  74 137  
1962  12 25  37  41 200  "
DF <- read.table(text = Lines, check.names = FALSE)
m <- as.matrix(DF)

Below are some alternative solutions. (1) seems the most flexible, since sum can easily be replaced with other functions to get different effects, but (2) is the shortest for this particular problem. Note also one slight difference: (1) produces a data.frame, while the other two produce a matrix.

1) aggregate

decade <- 10 * (as.numeric(rownames(m)) %/% 10)  # e.g. 1957 -> 1950
m.ag <- aggregate(m, data.frame(decade), sum)

which gives this data.frame:

> m.ag
  decade  1   2   3   4    5
1   1950 86 186 342 744 1096
2   1960 22  84 189 185  538
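
The flexibility mentioned above (swapping sum for another function) can be sketched as follows. This is only an illustration, not part of the original answer: it uses a small subset of the question's data (years 1958-1962, columns 1-3), and the name m.mean is made up for the example.

```r
# Subset of the question's data: years 1958-1962, columns 1-3.
m <- matrix(c(11, 20,  15,
              16, 20,  43,
               9, 28,  28,
               1, 31, 124,
              12, 25,  37),
            ncol = 3, byrow = TRUE,
            dimnames = list(as.character(1958:1962), as.character(1:3)))

# Same grouping variable as in the answer: the decade each year falls in.
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# Only the last argument changes: mean instead of sum.
m.mean <- aggregate(m, data.frame(decade), mean)
m.mean
```

The result is again a data.frame with one row per decade; any function that reduces a numeric vector to a single value (median, max, and so on) should work in the same position.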

2) rowsum. This one is shorter and produces a matrix.

rowsum(m, decade)
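
As a rough illustration of what rowsum returns, the sketch below runs it on a small subset of the question's data (years 1958-1962, columns 1-3). The subset and the name by_decade are assumptions made for the example only.

```r
# Subset of the question's data: years 1958-1962, columns 1-3.
m <- matrix(c(11, 20,  15,
              16, 20,  43,
               9, 28,  28,
               1, 31, 124,
              12, 25,  37),
            ncol = 3, byrow = TRUE,
            dimnames = list(as.character(1958:1962), as.character(1:3)))
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# rowsum adds up the rows of m within each group of 'decade'.
by_decade <- rowsum(m, decade)

is.matrix(by_decade)   # the result stays a matrix
rownames(by_decade)    # the groups become the row names
by_decade["1960", ]    # sums for 1960-1962: 22 84 189
```

Unlike aggregate, the group labels end up in the row names rather than in a separate decade column, which is convenient when the result is used in further matrix arithmetic.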

3) split/sapply. This one also produces a matrix. If the data frame DF is still available, we can replace as.data.frame(m) with DF, shortening it slightly.

t(sapply(split(as.data.frame(m), decade), colSums))
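
To check that the three approaches agree and differ only in the class of the result, one could run something like the sketch below. It is an illustration on a subset of the question's data (years 1958-1962, columns 1-3); the names by_aggregate, by_rowsum, by_split and agg_values are made up for the example.

```r
# Subset of the question's data: years 1958-1962, columns 1-3.
m <- matrix(c(11, 20,  15,
              16, 20,  43,
               9, 28,  28,
               1, 31, 124,
              12, 25,  37),
            ncol = 3, byrow = TRUE,
            dimnames = list(as.character(1958:1962), as.character(1:3)))
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

by_aggregate <- aggregate(m, data.frame(decade), sum)                     # (1)
by_rowsum    <- rowsum(m, decade)                                         # (2)
by_split     <- t(sapply(split(as.data.frame(m), decade), colSums))       # (3)

# (1) is a data.frame with the group in a 'decade' column; (2) and (3) are matrices.
c(is.data.frame(by_aggregate), is.matrix(by_rowsum), is.matrix(by_split))

# Drop the 'decade' column and all dimnames, then compare the numbers only.
agg_values <- unname(as.matrix(by_aggregate[-1]))
all.equal(agg_values, unname(by_rowsum))
all.equal(unname(by_split), unname(by_rowsum))
```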

Added solutions (2) and (3). Added some clarifications.
