Pass a vector with names to mutate to create multiple new columns


Problem description

I'm trying to recode answers using a vector that contains the correct answers. I wrote a for loop that, on each iteration, creates a new column (holding the coded answer), taking its name from a vector of possible names for the new columns.

However, it seems that mutate does not accept a vector of names. I've tried several different vectors and some paste0() combinations, but nothing seems to work.

Here is my reproducible code:

library(dplyr)
library(tibble)

correct = c(4, 5, 2, 2, 2, 3, 3, 5, 4, 5, 2, 1, 3, 4, 2, 2, 2, 4, 3, 1, 1, 5, 4, 1, 3, 2)

sub1 = c(3, 5, 1, 5, 4, 3, 2, 5, 4, 3, 4, 4, 4, 1, 5, 1, 4, 3, 3, 4, 3, 2, 4, 2, 3, 4)

df = t(data.frame(sub1))
colnames(df) = paste0("P", 1:26)

new_names = paste0("P", 1:26, "_coded")

for(i in 1:26){


  df = as.tibble(df) %>% 
    mutate(new_names = case_when(.[i] == correct[i] ~ 1, 
                     .[i] != correct[i] ~ 0, 
                     T ~ 9999999))

  print(df) # to know what's going on.

}

Also, I know that .dots can receive names in a vector (I think), but I don't quite understand how to use it with case_when inside mutate().

Other ways to create new columns with the recoded values are also welcome.

UPDATE: My expected output would be the original data frame with 26 new columns, P1_COD:P26_COD, with possible values 1 (if correct) and 0 (if incorrect).

Something like this (I just created four columns with 1s and 0s as an example):

df %>% 
  mutate(P1_COD = 1,
         P2_COD = 0,
         P3_COD = 1,
         P4_COD = 1)


Recommended answer

The data is not in a format that dplyr handles best. I would suggest restructuring your data into long ("longitudinal") format, with one row per subject and question; then the case_when becomes trivial and no for loop is required.

See the tidyr documentation on tidy data for more about data formats: http://tidyr.tidyverse.org/articles/tidy-data.html
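For reference, here is a hedged sketch of why the original loop did not behave as intended and how it could be repaired while staying in wide format. Inside mutate(), `new_names = ...` is taken literally as a column called new_names, so the same column is overwritten on every iteration; in addition, `.[i]` returns a one-column tibble rather than a vector. The sketch below assumes dplyr 1.0 or later (for across() and cur_column()); the names `wide`, `coded`, `coded_across`, `old_names`, and `new_col` are illustrative and not part of the original post.

```r
library(dplyr)
library(tibble)

correct <- c(4, 5, 2, 2, 2, 3, 3, 5, 4, 5, 2, 1, 3, 4, 2, 2, 2, 4, 3, 1, 1, 5, 4, 1, 3, 2)
sub1    <- c(3, 5, 1, 5, 4, 3, 2, 5, 4, 3, 4, 4, 4, 1, 5, 1, 4, 3, 3, 4, 3, 2, 4, 2, 3, 4)

old_names <- paste0("P", 1:26)
new_names <- paste0("P", 1:26, "_COD")

# One-row wide tibble with columns P1..P26 (illustrative construction)
wide <- as_tibble(as.list(setNames(sub1, old_names)))

# Option 1: keep the loop, but unquote the name on the left of `:=`
coded <- wide
for (i in seq_along(correct)) {
  new_col <- new_names[i]
  coded <- coded %>%
    mutate(!!new_col := as.numeric(.data[[old_names[i]]] == correct[i]))
}

# Option 2: no loop; across() names each new column via .names
coded_across <- wide %>%
  mutate(across(all_of(old_names),
                ~ as.numeric(.x == correct[match(cur_column(), old_names)]),
                .names = "{.col}_COD"))
```

Both options should give the same 52-column result. The long-format approach described next is still easier to extend to several subjects.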

Here is an example of the "longitudinal" format, including your sample data. I also added a couple of other subjects with random answers.

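library(tidyverse)

# Long format: one row per subject-question pair.
# sub1 is the response vector defined in the question.
responses <- tibble(
  subject = rep(1:3, each = 26),
  qNum = rep(1:26, 3),
  response = c(sub1,
               sample(5, 26, replace = TRUE),   # random answers for subject 2
               sample(5, 26, replace = TRUE)))  # random answers for subject 3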
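The table above is typed in by hand. If the data already exists in wide form (columns P1 to P26, as in the question), it could instead be reshaped with tidyr::pivot_longer. This is only a sketch: it assumes tidyr 1.0 or later, and the names `wide` and `long`, as well as the added `subject` column, are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)

sub1 <- c(3, 5, 1, 5, 4, 3, 2, 5, 4, 3, 4, 4, 4, 1, 5, 1, 4, 3, 3, 4, 3, 2, 4, 2, 3, 4)

# One-row wide tibble with columns P1..P26, plus a subject id
wide <- as_tibble(as.list(setNames(sub1, paste0("P", 1:26)))) %>%
  mutate(subject = 1L)

# Reshape to one row per subject-question pair
long <- wide %>%
  pivot_longer(cols = starts_with("P"),
               names_to = "qNum",
               names_prefix = "P",      # strip the leading "P" from the names
               values_to = "response") %>%
  mutate(qNum = as.integer(qNum))       # "1".."26" -> 1..26
```

The result has the columns subject, qNum, and response, matching the layout of `responses`.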

The answer key can be created and then merged in:

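# Answer key: one row per question.
# correct is the answer vector defined in the question.
answers <- tibble(
  qNum = 1:26,
  answer = correct)
df <- left_join(responses, answers, by = "qNum")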

Next, use dplyr::case_when:

df <- df %>% mutate(score = case_when(response == answer ~ 1,
                                TRUE ~ 0))

Note: the TRUE ~ 0 may be confusing at first. It is the fallback: it says what to do with any remaining rows, i.e. those where the first condition is not TRUE. The resulting df/tibble:
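One consequence of the fallback worth checking on a small case: a missing response makes `response == answer` evaluate to NA, which case_when treats as not matched, so the row falls through to TRUE ~ 0. The sketch below uses made-up three-element vectors and an illustrative variant (`score_keep_na`) that keeps missing responses as NA instead; neither appears in the original answer.

```r
library(dplyr)

# Toy data (illustrative only): the third response is missing
response <- c(3, 5, NA)
answer   <- c(4, 5, 2)

# As in the answer: anything that is not a match becomes 0, including NA
score <- case_when(response == answer ~ 1,
                   TRUE ~ 0)

# Variant: handle missing responses first so they stay NA
score_keep_na <- case_when(is.na(response) ~ NA_real_,
                           response == answer ~ 1,
                           TRUE ~ 0)
```

Conditions are evaluated in order, so the is.na() check has to come before the comparison.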

# A tibble: 26 x 5
   subject  qNum response answer score
     <dbl> <int>    <dbl>  <dbl> <dbl>
 1       1     1        3      4     0
 2       1     2        5      5     1
 3       1     3        1      2     0
 4       1     4        5      2     0
 5       1     5        4      2     0
 6       1     6        3      3     1
 7       1     7        2      3     0
 8       1     8        5      5     1
 9       1     9        4      4     1
10       1    10        3      5     0
# ... with 16 more rows

If you want to convert this to "wide" format, use tidyr::spread:

df %>%
  select(-response, -answer) %>% 
  spread(qNum, score, sep = ".")
# A tibble: 3 x 27
  subject qNum.1 qNum.2 qNum.3 qNum.4 qNum.5 qNum.6 qNum.7 qNum.8 qNum.9 qNum.10
*   <int>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl>
1       1      0      1      0      0      0      1      0      1      1       0
2       2      0      0      0      0      1      0      0      0      0       0
3       3      0      0      0      0      1      0      0      0      0       0
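In tidyr 1.0 and later, spread() is superseded by pivot_wider(), which can also produce the P1_COD-style column names requested in the question through its names_glue argument. The following is a sketch on a small made-up table (two subjects, three questions); `scored` and `wide_scores` are illustrative names.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy scored data in long format (illustrative values only)
scored <- tibble(
  subject = rep(1:2, each = 3),
  qNum    = rep(1:3, 2),
  score   = c(0, 1, 0, 1, 1, 0))

# One row per subject, one column per question, named P<qNum>_COD
wide_scores <- scored %>%
  pivot_wider(names_from = qNum,
              values_from = score,
              names_glue = "P{qNum}_COD")
```

The resulting columns are subject, P1_COD, P2_COD, and P3_COD; these could then be joined back onto the original wide data by subject if both sets of columns are needed.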
