Tracking the change in a sequence in R
Question
I asked something similar here, but the function I was given had some issues, so I will try to ask as clearly as I can.
I have a sample dataset that looks like this:
id <- c(1,1,1, 2,2,2, 3,3, 4,4, 5,5,5,5, 6,6,6, 7, 8,8)
item.id <- c(1,1,2, 1,1,1 ,1,1, 1,2, 1,2,2,2, 1,1,1, 1, 1,2)
sequence <- c(1,2,1, 1,2,3, 1,2, 1,1, 1,1,2,3, 1,2,3, 1, 1,1)
score <- c(0,0,0, 0,0,1, 2,0, 1,1, 1,0,1,1, 0,0,0, 1, 0,2)
data <- data.frame("id"=id, "item.id"=item.id, "sequence"=sequence, "score"=score)
> data
   id item.id sequence score
1   1       1        1     0
2   1       1        2     0
3   1       2        1     0
4   2       1        1     0
5   2       1        2     0
6   2       1        3     1
7   3       1        1     2
8   3       1        2     0
9   4       1        1     1
10  4       2        1     1
11  5       1        1     1
12  5       2        1     0
13  5       2        2     1
14  5       2        3     1
15  6       1        1     0
16  6       1        2     0
17  6       1        3     0
18  7       1        1     1
19  8       1        1     0
20  8       2        1     2
id represents each student, item.id represents the questions students take, sequence is the attempt number for each item.id, and score is the score for each attempt, taking the value 0, 1, or 2. Students can change their answers.
For each item.id within each id, I want to create a variable (status) by looking at the last two sequences (changes):
a) assign "WW" for those who changed from wrong to wrong,
b) assign "WR" for those who changed from wrong to right,
c) assign "RW" for those who changed from right to wrong, and
d) assign "RR" for those who changed from right to right.
A score change from 0 to 1 or from 0 to 2 is considered a correct (right) change, while a score change from 1 to 0 or from 2 to 0 is considered an incorrect (wrong) change.
If there is only one attempt for an item.id, as for id = 7, then the status should be "one.right" when the score is 1 or 2, and "one.wrong" when the score is 0. In general, a score is considered right when it is 1 or 2, and wrong when it is 0.
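The rules above can be sketched as a small base R function that labels the attempts for a single id/item.id pair. The name label_attempts is hypothetical (it is not from the original post), and the sketch assumes the scores are passed in attempt order.

```r
# Hypothetical helper (not from the original post): label one item's attempts.
# `score` holds the scores for a single id/item.id pair, in attempt order.
label_attempts <- function(score) {
  rw <- ifelse(score > 0, "R", "W")   # 1 or 2 is right, 0 is wrong
  if (length(rw) == 1) {
    # A single attempt gets its own label
    return(if (rw == "R") "one.right" else "one.wrong")
  }
  # Otherwise join the labels of the last two attempts
  paste(tail(rw, 2), collapse = "")
}

label_attempts(c(0, 0, 1))  # id 2, item 1: "WR"
label_attempts(c(2, 0))     # id 3, item 1: "RW"
label_attempts(1)           # id 7, item 1: "one.right"
label_attempts(0)           # id 1, item 2: "one.wrong"
```

Only the last two attempts matter, so earlier attempts (such as the first 0 for id 2) do not affect the label.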
The desired output would be:
> desired
   id item.id    status
1   1       1        WW
2   1       2 one.wrong
3   2       1        WR
4   3       1        RW
5   4       1 one.right
6   4       2 one.right
7   5       1 one.right
8   5       2        RR
9   6       1        WW
10  7       1 one.right
11  8       1 one.wrong
12  8       2 one.right
Any suggestions? Thanks!
Recommended answer
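library(dplyr)
library(forcats)

data %>%
  # 0 is wrong ("W"); 1 or 2 is right ("R")
  mutate(status = ifelse(score > 0, "R", "W")) %>%
  group_by(id, item.id) %>%
  # keep the last two attempts, or the only one; assumes sequence runs 1..n()
  filter(sequence == n() - 1 | sequence == n()) %>%
  # join the kept labels into one string per group
  summarise(status = paste(status, collapse = "")) %>%
  ungroup() %>%
  # groups with a single attempt get the one.* labels
  mutate(status = fct_recode(status, "one.wrong" = "W", "one.right" = "R"))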
I believe it's pretty much self-explanatory, but I'll break it down:
1) In the first mutate we create a W/R column from score: 0 gives 'W', everything above 0 gives 'R'.
2) Then we group the data by id and item.id and select the last two rows of each group, or keep the single row if the group has only one (filter).
3) After that we collapse this status column into one string per group (summarise). So the possible values are: 'W', 'R', 'WW', 'WR', 'RW', 'RR'.
4) The last thing left to do is to recode 'W' to 'one.wrong' and 'R' to 'one.right', using forcats::fct_recode.
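The filter step in the answer picks the last two rows by comparing sequence with n(), so it relies on sequence running 1, 2, ..., n within each group. As a hedged variant (not the answerer's code), the sketch below sorts the rows and uses slice_tail() instead, which selects by position and so should also cope with gaps in sequence. It assumes dplyr 1.0.0 or later for slice_tail() and the .groups argument, and it uses case_when() in place of fct_recode(), so status stays a character column.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  id       = c(1,1,1, 2,2,2, 3,3, 4,4, 5,5,5,5, 6,6,6, 7, 8,8),
  item.id  = c(1,1,2, 1,1,1, 1,1, 1,2, 1,2,2,2, 1,1,1, 1, 1,2),
  sequence = c(1,2,1, 1,2,3, 1,2, 1,1, 1,1,2,3, 1,2,3, 1, 1,1),
  score    = c(0,0,0, 0,0,1, 2,0, 1,1, 1,0,1,1, 0,0,0, 1, 0,2)
)

result <- data %>%
  mutate(status = ifelse(score > 0, "R", "W")) %>%
  arrange(id, item.id, sequence) %>%   # put attempts in order within each item
  group_by(id, item.id) %>%
  slice_tail(n = 2) %>%                # last two attempts, or the only one
  summarise(status = paste(status, collapse = ""), .groups = "drop") %>%
  mutate(status = case_when(
    status == "W" ~ "one.wrong",       # single wrong attempt
    status == "R" ~ "one.right",       # single right attempt
    TRUE          ~ status             # two-letter labels stay as they are
  ))

result
```

On the sample data this gives the same twelve id/item.id rows and status values as the desired output in the question.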