Merging rows of binary data based on columns using ddply

Views: 80 | Published: 2020/5/28 20:33:01 | Tags: r, sum, plyr

Question

I have the following data frame, in which I want to merge the binary values from a number of rows.

# Example data: two groups (ID, nr) and ten random binary columns X1..X10
df <- data.frame(ID = c(rep("A", 5), rep("B", 5)),
                 nr = c(rep("2", 5), rep("3", 5)),
                 replicate(10, sample(0:1, 10, replace = TRUE)))

For example:

# ID nr X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# A  2  0  0  1  1  1  1  1  1  1   0
# A  2  1  0  0  0  0  0  0  1  0   1
# A  2  0  0  1  1  1  0  0  0  0   1
# A  2  0  0  0  0  0  1  1  1  0   1
# A  2  0  0  0  1  0  1  1  0  1   1
# B  3  0  1  0  0  1  0  0  0  1   1
# B  3  1  1  0  0  0  0  0  0  0   1
# B  3  1  0  1  0  0  0  1  1  0   1
# B  3  1  1  1  0  1  0  0  1  1   1
# B  3  0  0  0  1  0  0  0  1  0   1

Now I want to merge the rows that share the same values in the first two columns (ID and nr in this case):

library(plyr)
df2 = ddply(df, c(1:2), summarise, numcolwise(sum,c(3:12)))

But I get the following error:

Error in vector(type, length) : 
   vector: cannot make a vector of mode 'closure'.

I would also like anything greater than 1 to be reset to 1 so that the result stays binary, but since I couldn't get past the error I haven't tried that yet.

I know variations of this question have been asked before, but I haven't found one quite like this. Keep in mind that I want to use column indices because I'm working with large data.

Recommended answer

If your data is quite large (as mentioned in the comments), forget about plyr and try data.table:

library(data.table)
setDT(df)[, lapply(.SD, sum), by = list(ID, nr)]

##    ID nr X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
## 1:  A  2  2  3  5  2  5  2  1  3  4   1
## 2:  B  3  3  3  4  1  3  2  3  2  1   4
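
The question also asks for column indices and for a result that stays binary, which the snippet above does not cover. The following is only a sketch of one way to do both with data.table, assuming the grouping columns are the first two and the binary columns are columns 3 to 12. The names `group_cols`, `value_cols`, `sums` and `merged` are illustrative, and the fixed seed is an assumption added for reproducibility. Taking the per-group `max` of 0/1 values gives 1 whenever any row in the group has a 1, which matches capping the sum at 1.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(ID = rep(c("A", "B"), each = 5),
                 nr = rep(c("2", "3"), each = 5),
                 replicate(10, sample(0:1, 10, replace = TRUE)))

group_cols <- names(df)[1:2]   # grouping columns, selected by position
value_cols <- names(df)[3:12]  # binary columns, selected by position

dt <- as.data.table(df)        # copy, so df itself is left unchanged

# Per-group column sums, restricted to the value columns via .SDcols
sums <- dt[, lapply(.SD, sum), by = group_cols, .SDcols = value_cols]

# Per-group maximum: 1 if any row in the group holds a 1, otherwise 0
merged <- dt[, lapply(.SD, max), by = group_cols, .SDcols = value_cols]
print(merged)
```

Using `max` avoids a second pass over the summed table. If the counts are needed as well, `sums` can be kept and capped afterwards with `pmin(x, 1)` on each value column.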

Or, if you want to stick with the plyr family, move on to its next generation, dplyr:

library(dplyr)
df %>%
  group_by(ID, nr) %>%
  # sum every non-grouping column; replaces the deprecated summarise_each(funs(sum))
  summarise(across(everything(), sum))

# Source: local data table [2 x 12]
# Groups: ID
# 
#   ID nr X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
# 1  A  2  2  3  5  2  5  2  1  3  4   1
# 2  B  3  3  3  4  1  3  2  3  2  1   4
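
As for the error in the question itself: the most likely cause is that `numcolwise(sum, c(3:12))` returns a function rather than a value, so `summarise` ends up trying to store a closure in a column. `numcolwise()` is meant to be passed to `ddply` directly as the function to apply. The sketch below shows that form alongside a base R `aggregate()` equivalent that needs no packages. It assumes the same column layout as the question (grouping columns 1 to 2, binary columns 3 to 12). The names `sums`, `binary` and `sums_plyr` are illustrative, and the plyr call is guarded so the block still runs when plyr is not installed.

```r
set.seed(1)  # assumption: fixed seed so the example is reproducible
df <- data.frame(ID = rep(c("A", "B"), each = 5),
                 nr = rep(c("2", "3"), each = 5),
                 replicate(10, sample(0:1, 10, replace = TRUE)))

# Base R: group by columns 1:2 and sum columns 3:12, all selected by position
sums <- aggregate(df[3:12], by = df[1:2], FUN = sum)

# Reset anything greater than 1 back to 1 so the result stays binary
binary <- sums
binary[3:12] <- lapply(sums[3:12], pmin, 1)
print(binary)

# plyr form: pass numcolwise(sum) as the function, without summarise
if (requireNamespace("plyr", quietly = TRUE)) {
  sums_plyr <- plyr::ddply(df, names(df)[1:2], plyr::numcolwise(sum))
  print(sums_plyr)
}
```

Passing `names(df)[1:2]` keeps the selection index-based while handing `ddply` the column names it expects as grouping variables.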
