Summarise over all columns
Question
I have data in the following format:
gen = function () sample.int(10, replace = TRUE)
x = data.frame(A = gen(), C = gen(), G = gen(), T = gen())
I would now like to attach, to each row, the total sum of all the elements in the row (my actual function is more complex, but sum() illustrates the problem).
Without dplyr, I'd write
cbind(x, Sum = apply(x, 1, sum))
resulting in:
  A C  G T Sum
1 3 1  6 9  19
2 3 4  3 3  13
3 3 1 10 5  19
4 7 2  1 6  16
…
But it seems surprisingly hard to do this with dplyr.
I've tried
x %>% rowwise() %>% mutate(Sum = sum(A : T))
But the result is not the sum of the columns of each row; it's something unexpected and (to me) inexplicable.
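One plausible reading of that result, sketched in base R: it assumes that A : T is evaluated as R's ordinary colon operator on the values held in columns A and T, rather than as a column range. The single row below is made up for illustration and mirrors the first row of the printed output.

```r
# Hypothetical single row (values chosen to match the first printed row)
row <- data.frame(A = 3, C = 1, G = 6, T = 9)

# A:T here is the integer sequence from the value of A to the value of T,
# i.e. 3, 4, ..., 9, so sum() adds up that sequence
seq_sum <- with(row, sum(A:T))

# The intended row total adds the four cell values instead
row_sum <- sum(unlist(row))

print(c(seq_sum = seq_sum, row_sum = row_sum))
```

Under that assumption the two numbers only agree by coincidence, which would explain why the Sum column looks arbitrary.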
I've also tried
x %>% rowwise() %>% mutate(Sum = sum(.))
But here, the dot (.) is simply a placeholder for the whole x. Providing no argument does, unsurprisingly, also not work (results are all 0). Needless to say, none of these variants works without rowwise(), either.
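A small base R sketch of what "the whole x" means in practice: sum() applied to an entire data frame returns one grand total, not one value per row. The set.seed() call is an addition made here purely so the sample data is reproducible.

```r
set.seed(1)  # assumption: fixed seed, only for reproducibility
gen <- function() sample.int(10, replace = TRUE)
x <- data.frame(A = gen(), C = gen(), G = gen(), T = gen())

# sum() over the whole data frame collapses every cell into a single number
grand_total <- sum(x)

# rowSums() is the base R way to get one total per row
row_totals <- rowSums(x)

print(grand_total)
print(row_totals)
```

The grand total equals the sum of the per-row totals, so a column filled with sum(x) would repeat the same number on every row.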
(There isn't really any reason to necessarily do this in dplyr, but (a) I'd like to keep my code as uniform as possible, and jumping between different APIs doesn't help; and (b) I'm hoping to one day get automatic and free parallelisation of such commands in dplyr.)
Recommended answer
I once did something similar, and at the time I ended up with:
x %>%
  rowwise() %>%
  do(data.frame(., res = sum(unlist(.))))
#    A  C G  T res
# 1  3  2 8  6  19
# 2  6  1 7 10  24
# 3  4  8 6  7  25
# 4  6  4 7  8  25
# 5  6 10 7  2  25
# 6  7  1 2  2  12
# 7  5  4 8  5  22
# 8  9  2 3  2  16
# 9  3  4 7  6  20
# 10 7  5 3  9  24
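For comparison only, and not part of the original answer: dplyr 1.0.0 and later provide c_across(), which collects the selected columns of the current row into a vector. The sketch below assumes such a dplyr version is installed; the fixed seed is added here only to make the sample data reproducible.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed, only for reproducibility
gen <- function() sample.int(10, replace = TRUE)
x <- data.frame(A = gen(), C = gen(), G = gen(), T = gen())

res <- x %>%
  rowwise() %>%
  # c_across(everything()) gathers all columns of the current row
  mutate(Sum = sum(c_across(everything()))) %>%
  ungroup()

print(res)
```

Replacing sum() with a more complex function should work the same way, provided that function accepts a plain vector.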
Perhaps your more complex function works fine without unlist(), but it seems to be necessary for sum(). Because . refers to the "current group", I initially thought that . for, e.g., the first row in the rowwise machinery would correspond to x[1, ], which is a list, and which sum() swallows happily outside do():
is.list((x[1, ]))
# [1] TRUE
sum(x[1, ])
# [1] 19
However, without unlist() in do() an error is generated, and I am not sure why:
x %>%
  rowwise() %>%
  do(data.frame(., res = sum(.)))
# Error in sum(.) : invalid 'type' (list) of argument
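A possible explanation, offered as an assumption about dplyr's internals rather than something the answer confirms: inside do() on a rowwise table, . appears to be a plain list of the row's values, not a one-row data frame. Base R treats the two differently, as this sketch with a made-up row shows.

```r
# Hypothetical row, once as a one-row data frame and once as a plain list
row_df   <- data.frame(A = 3, C = 2, G = 8, T = 6)
row_list <- as.list(row_df)

# sum() on a data frame goes through the data-frame method of the Summary
# group generic, which converts the frame to a matrix first, so it works
df_total <- sum(row_df)

# sum() on a plain list has no such method and raises an error;
# tryCatch() captures the message instead of stopping the script
list_error <- tryCatch(sum(row_list), error = function(e) conditionMessage(e))

# unlist() flattens the list to a numeric vector, which sum() accepts
list_total <- sum(unlist(row_list))

print(df_total)
print(list_error)
print(list_total)
```

If that assumption holds, x[1, ] works outside do() because it is still a data frame, while the list passed inside do() needs unlist() first.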