Summarise over all columns


Question

I have data of the following form:

gen = function () sample.int(10, replace = TRUE)
x = data.frame(A = gen(), C = gen(), G = gen(), T = gen())

I would now like to attach, to each row, the sum of all the elements in that row (my actual function is more complex, but sum illustrates the problem).

Without dplyr, I’d write

cbind(x, Sum = apply(x, 1, sum))

Resulting in:

   A C  G T Sum
1  3 1  6 9  19
2  3 4  3 3  13
3  3 1 10 5  19
4  7 2  1 6  16
…
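
As a quick check of the base R approach above, the sketch below rebuilds the data with a fixed seed (an assumption added only for reproducibility) and compares the apply() result with rowSums(). The names with_sum and same are illustrative and not part of the original post.

```r
# Fixed seed is an assumption for reproducibility; the original post does not set one.
set.seed(1)

gen <- function() sample.int(10, replace = TRUE)
x <- data.frame(A = gen(), C = gen(), G = gen(), T = gen())

# Append a per-row total; apply(x, 1, f) calls f once per row.
with_sum <- cbind(x, Sum = apply(x, 1, sum))

# For the special case of sum, rowSums() gives the same totals.
same <- all(with_sum$Sum == rowSums(x))
print(same)
```

apply() accepts any function of a row, so it still works when sum is replaced by something more complex, whereas rowSums() only covers this one case.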



But it seems surprisingly hard to do this with dplyr.

I have tried

x %>% rowwise() %>% mutate(Sum = sum(A : T))

But the result is not the sum of the columns of each row; it’s something unexpected and (to me) inexplicable.
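
A likely explanation, offered as a sketch: inside rowwise(), A and T are single values for the current row, so A : T is the integer sequence running from the value of A to the value of T, not a selection of the columns A through T. The base R snippet below mimics this for the first row of the output shown earlier (3, 1, 6, 9); the name row is purely illustrative.

```r
# Values taken from the first row of the example output (A = 3, C = 1, G = 6, T = 9).
row <- list(A = 3L, C = 1L, G = 6L, T = 9L)

# A:T expands to 3:9, so this sums 3 + 4 + ... + 9.
range_sum <- with(row, sum(A:T))

# Naming every column explicitly gives the intended row total.
row_total <- with(row, sum(A, C, G, T))

print(range_sum)  # 42
print(row_total)  # 19
```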

I have also tried

x %>% rowwise() %>% mutate(Sum = sum(.))

But here, . is simply a placeholder for the whole x. Providing no argument, unsurprisingly, does not work either (the results are all 0). Needless to say, none of these variants works without rowwise() either.

(There isn’t really any reason why this must be done in dplyr, but (a) I’d like to keep my code as uniform as possible, and jumping between different APIs doesn’t help; and (b) I’m hoping to one day get automatic and free parallelisation of such commands in dplyr.)

Recommended answer

I once did something similar, and at the time I ended up with:

x %>%
  rowwise() %>%
  do(data.frame(., res = sum(unlist(.))))
#    A  C G  T res
# 1  3  2 8  6  19
# 2  6  1 7 10  24
# 3  4  8 6  7  25
# 4  6  4 7  8  25
# 5  6 10 7  2  25
# 6  7  1 2  2  12
# 7  5  4 8  5  22
# 8  9  2 3  2  16
# 9  3  4 7  6  20
# 10 7  5 3  9  24






Perhaps your more complex function works fine without unlist, but it seems to be necessary for sum. Because . refers to the "current group", I initially thought that ., for e.g. the first row in the rowwise machinery, would correspond to x[1, ], which is a list, and which sum happily accepts outside do:

is.list(x[1, ])
# [1] TRUE

sum(x[1, ])
# [1] 19 

However, without unlist inside do, an error is generated, and I am not sure why:

x %>%
  rowwise() %>%
  do(data.frame(., res = sum(.)))
# Error in sum(.) : invalid 'type' (list) of argument
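
One plausible explanation, stated as an assumption about dplyr's internals rather than a confirmed fact: inside do() after rowwise(), . may be a plain list of the row's values rather than a one-row data frame. Base R's sum has a method for data frames (through the Summary group generic) but rejects plain lists, which the sketch below reproduces without dplyr. The names row_df and row_list are illustrative.

```r
# A one-row data frame, like x[1, ] in the example above.
row_df <- data.frame(A = 3L, C = 1L, G = 6L, T = 9L)

# The same values as a plain list, with the data.frame class dropped.
row_list <- as.list(row_df)

# sum() dispatches to the data frame method and works.
sum_df <- sum(row_df)

# On a plain list, sum() fails; capture the message instead of stopping.
err <- tryCatch(sum(row_list), error = function(e) conditionMessage(e))

# unlist() flattens the list to an atomic vector, which sum() accepts.
sum_unlisted <- sum(unlist(row_list))

print(sum_df)        # 19
print(err)
print(sum_unlisted)  # 19
```

If that assumption holds, it would account for why sum(x[1, ]) succeeds at the top level while sum(.) fails inside do, and why unlist fixes it.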
