Randomly sum values from rows and assign them to 2 columns in R


Problem description

I have a data.frame with 8 columns. One is for the list of subjects (one row per subject) and the other 7 columns each hold a score of either 1 or 0. This is what the data looks like:

> head(splitkscores)
  subject block3 block4 block5 block6 block7 block8 block9
1   40002      0      0      1      0      0      0      0
2   40002      0      0      1      0      0      1      1
3   40002      1      1      1      1      1      1      1
4   40002      1      1      0      0      0      1      0
5   40002      0      1      0      0      0      1      1
6   40002      0      1      1      0      1      1      1

I want to create a data.frame with 3 columns: one for the subject and two for sums. For each row of my data.frame (ignoring the subject column), one column must hold the sum of 3 or 4 randomly chosen values, and the other must hold the sum of the remaining values that were not chosen in that random sample.

Help is much appreciated. Thanks in advance

Solution

Here's a neat and tidy solution free of unnecessary complexity (assume the input is called df):

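# Randomly pick 3 or 4 of the non-subject columns (the same columns are used for every row)
chosen = sort(sample(setdiff(colnames(df), "subject"), sample(c(3, 4), 1)))
# Everything else, again excluding "subject"
notchosen = setdiff(colnames(df), c("subject", chosen))
# Sum each row over the chosen columns and over the remaining columns
out = data.frame(subject = df$subject,
                 sum1 = apply(df[, chosen], 1, sum),
                 sum2 = apply(df[, notchosen], 1, sum))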

In plain English: sample from the column names other than "subject", choosing a sample size of either 3 or 4, and call those column names chosen; define notchosen to be the other columns (excluding "subject" again, obviously); then return a data frame with the list of subjects, the sum of the chosen columns, and the sum of the non-chosen columns. Done.
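
Note that the code above draws one set of columns and reuses it for every row. If the random choice should instead be made separately for each row, as the wording of the question may suggest, a variant along the following lines could work. This is only a sketch: the df built here is copied from the head() output in the question, and the names scores, sum1 and out_per_row are made up for illustration.

```r
# Sample data copied from the head() output shown in the question
df <- data.frame(
  subject = rep(40002, 6),
  block3 = c(0, 0, 1, 1, 0, 0),
  block4 = c(0, 0, 1, 1, 1, 1),
  block5 = c(1, 1, 1, 0, 0, 1),
  block6 = c(0, 0, 1, 0, 0, 0),
  block7 = c(0, 0, 1, 0, 0, 1),
  block8 = c(0, 1, 1, 1, 1, 1),
  block9 = c(0, 1, 1, 0, 1, 1)
)

set.seed(1)  # assumption: fixed seed only so the illustration is reproducible

# Score columns as a numeric matrix (everything except "subject")
scores <- as.matrix(df[, setdiff(colnames(df), "subject")])

# For each row, draw 3 or 4 column positions at random and sum those values
sum1 <- apply(scores, 1, function(row) {
  k <- sample(c(3, 4), 1)        # how many values to pick in this row
  idx <- sample(length(row), k)  # which positions to pick, without replacement
  sum(row[idx])
})

# The remaining values are whatever is left of the row total
out_per_row <- data.frame(subject = df$subject,
                          sum1 = sum1,
                          sum2 = rowSums(scores) - sum1)
```

Because sum2 is computed as the row total minus sum1, the two columns always add up to the full row sum, whichever positions were drawn.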
