Creating new columns from a data.frame

Problem description

I have a dataset in long format in which measurements (Time) are nested in network partners (NP), which are nested in persons (ID). Here is an example of what it looks like (the real dataset has thousands of rows):

ID  NP  Time Outcome
1   11  1    4
1   11  2    3
1   11  3    NA
1   12  1    2
1   12  2    3
1   12  3    3
2   21  1    2
2   21  2    NA
2   21  3    NA
2   22  1    4
2   22  2    4
2   22  3    4

Now I would like to create 3 new variables:

a) The number of network partners (with a non-NA outcome at that measurement) a specific person (ID) has at Time 1

b) The number of network partners (with a non-NA outcome at that measurement) a specific person (ID) has at Time 2

c) The number of network partners (with a non-NA outcome at that measurement) a specific person (ID) has at Time 3

So I would like to create a dataset like this:

ID  NP  Time Outcome  NP.T1  NP.T2  NP.T3
1   11  1    4        2      2      1
1   11  2    3        2      2      1
1   11  3    NA       2      2      1
1   12  1    2        2      2      1
1   12  2    3        2      2      1
1   12  3    3        2      2      1
2   21  1    2        2      1      1
2   21  2    NA       2      1      1
2   21  3    NA       2      1      1
2   22  1    4        2      1      1
2   22  2    4        2      1      1
2   22  3    4        2      1      1

I would really appreciate your help.

Solution

You can create just one variable rather than three: a single column that holds, for each row, the count for that row's own ID and Time. I am using ddply from the plyr package for that.

mydata<-structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L), NP = c(11L, 11L, 11L, 12L, 12L, 12L, 21L, 21L, 21L, 
22L, 22L, 22L), Time = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 
1L, 2L, 3L), Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA, 
4L, 4L, 4L)), .Names = c("ID", "NP", "Time", "Outcome"), class = "data.frame", row.names = c(NA, 
-12L))


library(plyr)
# Count the non-missing outcomes within each ID x Time group
mydata1 <- ddply(mydata, .(ID, Time), transform, NP.T = sum(!is.na(Outcome)))
> mydata1
   ID NP Time Outcome NP.T
1   1 11    1       4    2
2   1 12    1       2    2
3   1 11    2       3    2
4   1 12    2       3    2
5   1 11    3      NA    1
6   1 12    3       3    1
7   2 21    1       2    2
8   2 22    1       4    2
9   2 21    2      NA    1
10  2 22    2       4    1
11  2 21    3      NA    1
12  2 22    3       4    1
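
For readers who would rather not depend on plyr, the same per-group count can be sketched in base R. This is only an illustration under the assumption that the data look like the sample above. The data frame is rebuilt with data.frame() for self-containment, and ave() stands in for ddply(); it is not part of the original answer.

```r
# Sample data from the question, rebuilt so the block runs on its own
mydata <- data.frame(
  ID      = rep(1:2, each = 6),
  NP      = rep(c(11L, 12L, 21L, 22L), each = 3),
  Time    = rep(1:3, times = 4),
  Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA, 4L, 4L, 4L)
)

# ave() applies FUN within each ID x Time group and returns a vector
# aligned with the original rows, so the row order is left unchanged.
# !is.na() marks the non-missing outcomes; summing them gives the count.
mydata$NP.T <- ave(as.integer(!is.na(mydata$Outcome)),
                   mydata$ID, mydata$Time, FUN = sum)

print(mydata)
```

Unlike the ddply() result, which comes back sorted by ID and Time, this keeps the rows in their original order (sorted by ID and NP here).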

Update: you can also use interaction() to create a single variable (comb) that identifies each combination of ID and Time:

mydata1 <- ddply(mydata, .(ID, Time), transform, NP.T = sum(!is.na(Outcome)), comb = interaction(ID, Time))
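
If the three separate columns from the question (NP.T1, NP.T2, NP.T3) are still wanted, one possible base-R sketch is to tabulate the counts per person and time point with tapply() and then look them up for every row. The helper names counts and tp are illustrative only, and the approach assumes every ID occurs at every time point (a missing combination would yield NA).

```r
# Sample data from the question, rebuilt so the block runs on its own
mydata <- data.frame(
  ID      = rep(1:2, each = 6),
  NP      = rep(c(11L, 12L, 21L, 22L), each = 3),
  Time    = rep(1:3, times = 4),
  Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA, 4L, 4L, 4L)
)

# Matrix of non-missing outcome counts: one row per ID, one column per Time
counts <- tapply(!is.na(mydata$Outcome),
                 list(mydata$ID, mydata$Time), sum)

# For each time point, look up every row's ID in the count matrix and
# store the result as a new column NP.T<time>
for (tp in colnames(counts)) {
  mydata[[paste0("NP.T", tp)]] <-
    as.integer(counts[as.character(mydata$ID), tp])
}

print(mydata)
```

Each person's counts are repeated on all of that person's rows, which is the layout requested in the question.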
