Select first observed data and utilize mutate
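Problem description
I am running into an issue with my data: for each individual id, I want to take the first observed score (in ob order) and subtract the last observed score from it.
The problem with asking for the first observation minus the last observation is that sometimes the first observation is missing.
Is there a way to ask for the first observed score for each individual, skipping any missing data?
I built the data frame below to illustrate my problem.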
help <- data.frame(id = c(5, 5, 5, 5, 5, 12, 12, 12, 17, 17, 20, 20, 20),
                   ob = c(1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 1, 2, 3),
                   score = c(NA, 2, 3, 4, 3, 7, 3, 4, 3, 4, NA, 1, 4))
   id ob score
1   5  1    NA
2   5  2     2
3   5  3     3
4   5  4     4
5   5  5     3
6  12  1     7
7  12  2     3
8  12  3     4
9  17  1     3
10 17  2     4
11 20  1    NA
12 20  2     1
13 20  3     4
What I am hoping to run is code that will give me...
   id ob score es
1   5  1    NA -1
2   5  2     2 -1
3   5  3     3 -1
4   5  4     4 -1
5   5  5     3 -1
6  12  1     7  3
7  12  2     3  3
8  12  3     4  3
9  17  1     3 -1
10 17  2     4 -1
11 20  1    NA -3
12 20  2     1 -3
13 20  3     4 -3
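I am trying to work within dplyr and I understand the use of the group_by command, but I am not sure how to select only the first observed score and then use mutate to create es.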
Solution
I would use first() and last() (both dplyr functions) together with na.omit() (from the default stats package).
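To see what dropping the missing values does before anything is grouped, here is a minimal base R sketch on the scores of id 5 alone. The names x and obs are introduced only for this illustration; obs[1] and obs[length(obs)] pick the same elements that dplyr's first() and last() would.

```r
x <- c(NA, 2, 3, 4, 3)      # scores for id 5 in the example data

obs <- na.omit(x)           # drops the NA, keeps the remaining values in order
obs[1]                      # first observed value: 2
obs[length(obs)]            # last observed value: 3
obs[1] - obs[length(obs)]   # first minus last for this id: -1
```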
First, I would make sure your score column is a numeric column with proper NA values (not strings, as in your example):
help <- data.frame(id = c(5, 5, 5, 5, 5, 12, 12, 12, 17, 17, 20, 20, 20),
                   ob = c(1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 1, 2, 3),
                   score = c(NA, 2, 3, 4, 3, 7, 3, 4, 3, 4, NA, 1, 4))
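If score has instead been read in as text, for example from a file where missing values were written as the string "NA", one way to repair the column is sketched below. The data frame raw and its contents are made up for illustration and are not part of the question's data.

```r
# Hypothetical input: score stored as character, with "NA" written as text
raw <- data.frame(id = c(5, 5, 12),
                  ob = c(1, 2, 1),
                  score = c("NA", "2", "7"),
                  stringsAsFactors = FALSE)

# as.numeric() parses the digits and turns the "NA" string into a real missing value
raw$score <- as.numeric(raw$score)

is.numeric(raw$score)   # TRUE
is.na(raw$score)        # TRUE only for the first row
```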
Then you can do
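library(dplyr)

help %>%
  arrange(id, ob) %>%   # keep rows in observation order within each id
  group_by(id) %>%
  # first non-missing score minus last non-missing score, repeated on every row of the id
  mutate(es = first(na.omit(score)) - last(na.omit(score))) %>%
  ungroup()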
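For cases where dplyr is not available, the same idea can be sketched in base R with ave(). This is an alternative technique, not the approach used above. The data frame is named scores here only to avoid masking base R's help(), and first_minus_last is a helper name invented for this example.

```r
# Same data as in the question, under a different (illustrative) name
scores <- data.frame(id = c(5, 5, 5, 5, 5, 12, 12, 12, 17, 17, 20, 20, 20),
                     ob = c(1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 1, 2, 3),
                     score = c(NA, 2, 3, 4, 3, 7, 3, 4, 3, 4, NA, 1, 4))

# Sort so that rows follow observation order within each id
scores <- scores[order(scores$id, scores$ob), ]

# Helper: first non-missing value minus last non-missing value
first_minus_last <- function(x) {
  obs <- x[!is.na(x)]
  if (length(obs) == 0) return(NA_real_)  # id with no observed score at all
  obs[1] - obs[length(obs)]
}

# ave() applies the helper per id and fills every row of the group with the result
scores$es <- ave(scores$score, scores$id, FUN = first_minus_last)
scores
```

On the example data this gives the es column requested in the question: -1 for id 5, 3 for id 12, -1 for id 17 and -3 for id 20.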