How to create missing values for repeated measurement data?
Problem description
I have a data set in which not every subject was observed at the same time points. I want to turn it into a data set in which every subject has a row for each of the same time points (so that I can use it in SAS proc traj).
For example, suppose I have dataset "m":
id <- c(1,1,1,1,2,2,3,3,3)
age <- c(2,3,4,5,3,6,2,5,8)
IQ <- c(3,4,5,4,6,5,3,8,10)
m <- data.frame(id,age,IQ)
> m
id age IQ
1 1 2 3
2 1 3 4
3 1 4 5
4 1 5 4
5 2 3 6
6 2 6 5
7 3 2 3
8 3 5 8
9 3 8 10
> unique(age)
[1] 2 3 4 5 6 8
I want to turn m into m2, but so far I can only do that manually:
id2 <- c(1,1,1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3)
age2 <- c(2,3,4,5,6,8,2,3,4,5,6,8,2,3,4,5,6,8)
IQ2 <- c(3,4,5,4,NA,NA,NA,6,NA,NA,5,NA,3,NA,NA,8,NA,10)
m2 <- data.frame(id2,age2,IQ2)
> m2
id2 age2 IQ2
1 1 2 3
2 1 3 4
3 1 4 5
4 1 5 4
5 1 6 NA
6 1 8 NA
7 2 2 NA
8 2 3 6
9 2 4 NA
10 2 5 NA
11 2 6 5
12 2 8 NA
13 3 2 3
14 3 3 NA
15 3 4 NA
16 3 5 8
17 3 6 NA
18 3 8 10
Does anyone know a smarter way to do this?
Recommended answer
Using tidyr, this is a one-liner. The complete function creates a row for every combination of the columns passed to it (here id and age) and fills the remaining columns with NA wherever a combination was not present in the original data:
library(tidyr)
complete(m, id, age)
Source: local data frame [18 x 3]
id age IQ
(dbl) (dbl) (dbl)
1 1 2 3
2 1 3 4
3 1 4 5
4 1 5 4
5 1 6 NA
6 1 8 NA
7 2 2 NA
8 2 3 6
9 2 4 NA
10 2 5 NA
11 2 6 5
12 2 8 NA
13 3 2 3
14 3 3 NA
15 3 4 NA
16 3 5 8
17 3 6 NA
18 3 8 10
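
If tidyr is not available, the same "every combination, NA where missing" idea can be sketched in base R. This is only an illustrative alternative and not part of the original answer; the names grid and m_full are made up for the example. It builds all id/age pairs with expand.grid and then left-joins the observed values with merge:

```r
# Rebuild the example data from the question
m <- data.frame(
  id  = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
  age = c(2, 3, 4, 5, 3, 6, 2, 5, 8),
  IQ  = c(3, 4, 5, 4, 6, 5, 3, 8, 10)
)

# Every id paired with every observed age (3 ids x 6 ages = 18 rows);
# "grid" is a hypothetical helper name
grid <- expand.grid(id = unique(m$id), age = sort(unique(m$age)))

# Left join: keep all rows of grid; combinations absent from m get IQ = NA
m_full <- merge(grid, m, by = c("id", "age"), all.x = TRUE)

# Order by id and age explicitly rather than relying on merge() defaults
m_full <- m_full[order(m_full$id, m_full$age), ]
rownames(m_full) <- NULL

m_full
```

The tidyr version is shorter, but this variant makes the two steps explicit: first enumerate the full grid of time points per subject, then attach whatever measurements exist.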