Tracking the change in a sequence in R


Question

I asked something similar before, but the function I got had some issues, so I will try to ask this as clearly as I can.

I have a sample dataset that looks like this:

id <-       c(1,1,1, 2,2,2, 3,3, 4,4, 5,5,5,5, 6,6,6, 7, 8,8)
item.id <-  c(1,1,2, 1,1,1, 1,1, 1,2, 1,2,2,2, 1,1,1, 1, 1,2)
sequence <- c(1,2,1, 1,2,3, 1,2, 1,1, 1,1,2,3, 1,2,3, 1, 1,1)
score <-    c(0,0,0, 0,0,1, 2,0, 1,1, 1,0,1,1, 0,0,0, 1, 0,2)

data <- data.frame("id"=id, "item.id"=item.id, "sequence"=sequence, "score"=score)
> data
    id item.id sequence score
1   1       1        1     0
2   1       1        2     0
3   1       2        1     0
4   2       1        1     0
5   2       1        2     0
6   2       1        3     1
7   3       1        1     2
8   3       1        2     0
9   4       1        1     1
10  4       2        1     1
11  5       1        1     1
12  5       2        1     0
13  5       2        2     1
14  5       2        3     1
15  6       1        1     0
16  6       1        2     0
17  6       1        3     0
18  7       1        1     1
19  8       1        1     0
20  8       2        1     2

id identifies each student, item.id identifies the question the student answered, sequence is the attempt number for each item.id, and score is the score for each attempt (0, 1, or 2). Students can change their answers.

For each item.id within each id, I want to create a variable (status) by looking at the last two attempts in the sequence (the change):

a) assign "WW" for those who changed from wrong to wrong,
b) assign "WR" for those who changed from wrong to right,
c) assign "RW" for those who changed from right to wrong, and
d) assign "RR" for those who changed from right to right.

A score change from 0 to 1 or from 0 to 2 is considered a correct (right) change, while a score change from 1 to 0 or from 2 to 0 is considered an incorrect (wrong) change.

If there is only one attempt for an item.id, as for id=7, then the status should be "one.right". If that single score were 0, it should be "one.wrong". In general, a score is considered right when it is 1 or 2 and wrong when it is 0.
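The rules above can be sketched as a small base R helper that works on the scores of a single id/item.id pair. The function name attempt_status is hypothetical (it is not part of the question), and the sketch assumes the scores are passed in attempt order with at least one attempt.

```r
# Hypothetical helper: derive the status for one id/item.id pair.
# Assumes `scores` is ordered by attempt and has at least one element.
attempt_status <- function(scores) {
  # A score of 1 or 2 counts as right ("R"), 0 counts as wrong ("W")
  labels <- ifelse(scores > 0, "R", "W")

  # A single attempt gets its own code
  if (length(labels) == 1) {
    return(if (labels == "R") "one.right" else "one.wrong")
  }

  # Otherwise only the last two attempts matter
  paste(tail(labels, 2), collapse = "")
}

attempt_status(c(0, 0, 1))  # "WR"
attempt_status(c(2, 0))     # "RW"
attempt_status(1)           # "one.right"
attempt_status(0)           # "one.wrong"
```

Applying this helper to each id/item.id group would be one way to build the status column described here.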

The desired output, covering all of these cases, would be:

> desired
   id item.id    status
1   1       1        WW
2   1       2 one.wrong
3   2       1        WR
4   3       1        RW
5   4       1 one.right
6   4       2 one.right
7   5       1 one.right
8   5       2        RR
9   6       1        WW
10  7       1 one.right
11  8       1 one.wrong
12  8       2 one.right

Any suggestions? Thanks!

Recommended answer

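library(dplyr)
library(forcats)

data %>% 
  # Label each attempt: any score above 0 is right ("R"), 0 is wrong ("W")
  mutate(status = ifelse(score > 0, "R", "W")) %>% 
  group_by(id, item.id) %>% 
  # Keep the last two attempts (or the only attempt) in each id/item.id group
  filter(sequence == n() - 1 | sequence == n()) %>%  
  # Concatenate the labels in attempt order, e.g. "W" then "R" gives "WR"
  summarise(status = paste(status, collapse = "")) %>% 
  ungroup() %>% 
  # Rename the single-attempt codes
  mutate(status = fct_recode(status, "one.wrong" = "W", "one.right" = "R"))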

I believe it is fairly self-explanatory, but I'll break it down:

1) In the first mutate we create a W/R column from score: 0 gives 'W', anything above 0 gives 'R'.

2) Then we group the data by id and item.id and keep the last two rows of each group, or the single row if the group has only one (filter). This relies on sequence running 1, 2, ..., n within each group, so that n() equals the last attempt number.

3) After that we collapse the status column into one string per group (summarise). So the possible values are: 'W', 'R', 'WW', 'WR', 'RW', 'RR'.

4) Finally, we recode 'W' to 'one.wrong' and 'R' to 'one.right' using forcats::fct_recode, which also turns status into a factor.
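The four steps can also be written without depending on sequence being numbered 1 to n in every group. The following is a minimal sketch of that variant, not the answer's original code: it assumes dplyr 1.0.0 or later (for slice_tail and the .groups argument), sorts each group by sequence, takes the last two rows by position, and uses case_when instead of fct_recode so that status stays a character column. The name result is introduced here for illustration.

```r
library(dplyr)

# Example data from the question
data <- data.frame(
  id       = c(1,1,1, 2,2,2, 3,3, 4,4, 5,5,5,5, 6,6,6, 7, 8,8),
  item.id  = c(1,1,2, 1,1,1, 1,1, 1,2, 1,2,2,2, 1,1,1, 1, 1,2),
  sequence = c(1,2,1, 1,2,3, 1,2, 1,1, 1,1,2,3, 1,2,3, 1, 1,1),
  score    = c(0,0,0, 0,0,1, 2,0, 1,1, 1,0,1,1, 0,0,0, 1, 0,2)
)

result <- data %>%
  # Step 1: label each attempt as right ("R") or wrong ("W")
  mutate(status = ifelse(score > 0, "R", "W")) %>%
  group_by(id, item.id) %>%
  # Step 2: order attempts within each group, then keep the last two rows
  # (a group with a single row keeps that row)
  arrange(sequence, .by_group = TRUE) %>%
  slice_tail(n = 2) %>%
  # Step 3: collapse the labels into one string per group
  summarise(status = paste(status, collapse = ""), .groups = "drop") %>%
  # Step 4: rename the single-attempt codes, leave two-letter codes as-is
  mutate(status = case_when(
    status == "W" ~ "one.wrong",
    status == "R" ~ "one.right",
    TRUE ~ status
  ))

print(result)
```

Selecting rows by position rather than by the value of sequence means the sketch should still behave sensibly if attempt numbers have gaps or do not start at 1, at the cost of an explicit sort.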
