Creating new columns from a data.frame
Problem description
I have a dataset in long format in which measurements (Time) are nested in network partners (NP), which are in turn nested in persons (ID). Here is an example of what it looks like (the real dataset has thousands of rows):
ID NP Time Outcome
1 11 1 4
1 11 2 3
1 11 3 NA
1 12 1 2
1 12 2 3
1 12 3 3
2 21 1 2
2 21 2 NA
2 21 3 NA
2 22 1 4
2 22 2 4
2 22 3 4
Now I would like to create 3 new variables:
a) The number of network partners (with no NA in the outcome at this measurement) that a specific person (ID) has at Time 1
b) The number of network partners (with no NA in the outcome at this measurement) that a specific person (ID) has at Time 2
c) The number of network partners (with no NA in the outcome at this measurement) that a specific person (ID) has at Time 3
So I would like to create a dataset like this:
ID NP Time Outcome NP.T1 NP.T2 NP.T3
1 11 1 4 2 2 1
1 11 2 3 2 2 1
1 11 3 NA 2 2 1
1 12 1 2 2 2 1
1 12 2 3 2 2 1
1 12 3 3 2 2 1
2 21 1 2 2 1 1
2 21 2 NA 2 1 1
2 21 3 NA 2 1 1
2 22 1 4 2 1 1
2 22 2 4 2 1 1
2 22 3 4 2 1 1
I would really appreciate your help.
You can create just one variable rather than three. I am using ddply from the plyr package for that.
mydata<-structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L), NP = c(11L, 11L, 11L, 12L, 12L, 12L, 21L, 21L, 21L,
22L, 22L, 22L), Time = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L), Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA,
4L, 4L, 4L)), .Names = c("ID", "NP", "Time", "Outcome"), class = "data.frame", row.names = c(NA,
-12L))
library(plyr)
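# Count the non-missing outcomes within each ID x Time group.
# is.na() is the reliable test for missing values; comparing against the string "NA" only works by accident.
mydata1 <- ddply(mydata, .(ID, Time), transform, NP.T = sum(!is.na(Outcome)))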
>mydata1
ID NP Time Outcome NP.T
1 1 11 1 4 2
2 1 12 1 2 2
3 1 11 2 3 2
4 1 12 2 3 2
5 1 11 3 NA 1
6 1 12 3 3 1
7 2 21 1 2 2
8 2 22 1 4 2
9 2 21 2 NA 1
10 2 22 2 4 1
11 2 21 3 NA 1
12 2 22 3 4 1
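For readers who would rather not depend on plyr, the same per-group count can probably be obtained in base R. The following is a minimal sketch, not the answerer's method: it rebuilds the example data with data.frame() and uses ave(), which keeps the original row order instead of sorting by group as ddply does.

```r
# Rebuild the example data from the question.
mydata <- data.frame(
  ID      = rep(1:2, each = 6),
  NP      = rep(c(11L, 12L, 21L, 22L), each = 3),
  Time    = rep(1:3, times = 4),
  Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA, 4L, 4L, 4L)
)

# Turn "Outcome is present" into 1/0, then sum those flags within
# every ID x Time group. ave() returns one value per input row,
# so the result can be attached directly as a new column.
mydata$NP.T <- ave(
  as.integer(!is.na(mydata$Outcome)),
  mydata$ID, mydata$Time,
  FUN = sum
)

print(mydata)
```

The counts are the same as in the ddply result; only the row order differs, because the rows stay where they were in mydata.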
Update: you can also use interaction() to create a single variable (comb) that uniquely identifies each combination of ID and Time:
mydata1 <- ddply(mydata, .(ID, Time), transform, NP.T = sum(!is.na(Outcome)), comb = interaction(ID, Time))
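The question asked for three separate columns (NP.T1, NP.T2, NP.T3) repeated on every row of a person, whereas the answer above produces a single column. If that wide layout is really needed, one possible base R approach is sketched below. It assumes the example data from the question; counts, row_of and tm are illustrative names, not part of the original answer.

```r
# Rebuild the example data from the question.
mydata <- data.frame(
  ID      = rep(1:2, each = 6),
  NP      = rep(c(11L, 12L, 21L, 22L), each = 3),
  Time    = rep(1:3, times = 4),
  Outcome = c(4L, 3L, NA, 2L, 3L, 3L, 2L, NA, NA, 4L, 4L, 4L)
)

# ID x Time matrix of non-missing outcome counts
# (rows are named by ID, columns by Time).
counts <- tapply(!is.na(mydata$Outcome),
                 list(mydata$ID, mydata$Time),
                 sum)

# For every row of mydata, find the matching row of the count matrix.
row_of <- match(mydata$ID, as.integer(rownames(counts)))

# Add one column per time point: NP.T1, NP.T2, NP.T3.
for (tm in colnames(counts)) {
  mydata[[paste0("NP.T", tm)]] <- unname(counts[row_of, tm])
}

print(mydata)
```

Because the column names are derived from the values found in Time, the same code should also cope with more than three measurement occasions.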