R colSums By Group
Question
In the following matrix dataset:
1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200
I have been trying to calculate colSums by decade, i.e., find the sum of each column for 1950-1959, then for 1960-1969, and so on.
I tried tapply, ddply, etc., but couldn't figure out anything that would actually work.
Recommended answer
First we set up the matrix used as input.
Lines <- "1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200 "
DF <- read.table(text = Lines, check.names = FALSE)
m <- as.matrix(DF)
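As a quick sanity check, and assuming the input is set up exactly as above, the following sketch inspects the result. read.table uses the first column (the years) as row names because the header line has one fewer field than the data lines, and check.names = FALSE keeps the column names as "1" to "5" rather than converting them to syntactic names.

```r
# Same input as above, repeated so this snippet runs on its own
Lines <- "1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200"
DF <- read.table(text = Lines, check.names = FALSE)
m <- as.matrix(DF)

dim(m)             # 13 rows (one per year) by 5 columns
head(rownames(m))  # the years, stored as character strings
colnames(m)        # column names kept as-is because check.names = FALSE
is.numeric(m)      # TRUE, so sums can be taken directly
```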
Below we show some alternative solutions. (1) seems the most flexible, in that we can easily replace sum with other functions to get different effects, but (2) is the shortest for this particular problem. Also note that there are some slight differences: (1) produces a data.frame, while the other two produce a matrix.
1) aggregate
decade <- 10 * as.numeric(rownames(m)) %/% 10
m.ag <- aggregate(m, data.frame(decade), sum)
which gives this data.frame:
> m.ag
decade 1 2 3 4 5
1 1950 86 186 342 744 1096
2 1960 22 84 189 185 538
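To illustrate the flexibility mentioned for (1), here is a minimal sketch that swaps sum for mean to get per-decade column averages. The name m.mean is introduced here only for illustration. The parentheses in the decade line are optional, since %/% binds more tightly than * in R, but they make the intent explicit.

```r
# Same input as above, repeated so this snippet runs on its own
Lines <- "1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200"
m <- as.matrix(read.table(text = Lines, check.names = FALSE))

# Integer-divide each year by 10, then scale back up: 1957 -> 195 -> 1950
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# Same call as in (1), with mean in place of sum (m.mean is a hypothetical name)
m.mean <- aggregate(m, data.frame(decade), mean)
m.mean
```

Any function that reduces a vector to a single value (for example max or median) could be substituted in the same position.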
2) rowsum
This one is shorter. It produces a matrix result.
rowsum(m, decade)
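For reference, a self-contained sketch of (2) with its result. The name r is introduced here only for illustration. The sums match those from (1), but the decades appear as row names of a matrix rather than as a separate column.

```r
# Same input as above, repeated so this snippet runs on its own
Lines <- "1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200"
m <- as.matrix(read.table(text = Lines, check.names = FALSE))
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

r <- rowsum(m, decade)  # sums the rows of m within each decade group
r
#       1   2   3   4    5
# 1950 86 186 342 744 1096
# 1960 22  84 189 185  538

is.matrix(r)  # TRUE
```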
3) split/sapply
This one produces a matrix as well. If we had DF, we could replace as.data.frame(m) with DF, shortening it slightly.
t(sapply(split(as.data.frame(m), decade), colSums))
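The shortening mentioned for (3) can be sketched as follows, assuming DF from the setup step is still available. The names res_m and res_df are introduced here only for illustration. split breaks the data frame into one piece per decade, colSums is applied to each piece, and t() turns the result so that decades are rows.

```r
# Same input as above, repeated so this snippet runs on its own
Lines <- "1 2 3 4 5
1950 7 20 21 15 61
1951 2 10 6 26 57
1952 12 27 43 37 34
1953 14 16 40 47 94
1954 2 17 62 113 101
1955 3 4 43 99 148
1956 2 47 31 85 79
1957 17 5 38 216 228
1958 11 20 15 76 68
1959 16 20 43 30 226
1960 9 28 28 70 201
1961 1 31 124 74 137
1962 12 25 37 41 200"
DF <- read.table(text = Lines, check.names = FALSE)
m <- as.matrix(DF)
decade <- 10 * (as.numeric(rownames(m)) %/% 10)

# Version from (3), starting from the matrix
res_m <- t(sapply(split(as.data.frame(m), decade), colSums))

# Slightly shorter version, starting from the data frame directly
res_df <- t(sapply(split(DF, decade), colSums))

res_df
all.equal(res_m, res_df)  # compares the two results
```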
Edit: added solutions (2) and (3) and some clarifications.