Replace NA in a series of variables with different types of missing

Views: 49 Published: 2021/4/28 20:01:33 r database dataframe dplyr missing-data


Problem description

Here is my data.

# A tibble: 10 x 6
     id  main   s_0   s_1   s_2   s_3  
   <dbl> <fct> <fct> <fct> <fct> <fct>
 1    1     5    75    A     4     110  
 2    2    NA    NA    NA    NA    NA  
 3    3    11    13    NA    7     769  
 4    4    NA    NA    NA    NA    NA   
 5    5    11    NA    NA    NA    835  
 6    6    13    39    NA    4     NA   
 7    7    NA    NA    NA    NA    NA  
 8    8    19    42    D     6     654   
 9    9    20    4     NA    7     577  
10   10    NA    NA    NA    NA    NA
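
For anyone who wants to reproduce the table, the following is a minimal base R sketch that rebuilds it. It assumes the object is called dat (the name used in the code further down) and uses a plain data.frame with factor columns rather than a tibble; every missing cell is entered as NA regardless of its cause.

```r
# Rebuild the example table as a data frame named dat (assumed name).
# Character vectors become factors because stringsAsFactors = TRUE.
dat <- data.frame(
  id   = as.numeric(1:10),
  main = c("5", NA, "11", NA, "11", "13", NA, "19", "20", NA),
  s_0  = c("75", NA, "13", NA, NA, "39", NA, "42", "4", NA),
  s_1  = c("A", NA, NA, NA, NA, NA, NA, "D", NA, NA),
  s_2  = c("4", NA, "7", NA, NA, "4", NA, "6", "7", NA),
  s_3  = c("110", NA, "769", NA, "835", NA, NA, "654", "577", NA),
  stringsAsFactors = TRUE
)

# Rows where main is NA are the ineligible ids (2, 4, 7 and 10).
dat$id[is.na(dat$main)]
```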

As you can see, the column main indicates whether the person in each row was asked the questions in the other columns (s_0:s_3). Ids 2, 4, 7 and 10 were not eligible for the rest of the questions, whereas the other participants could either answer or skip s_0:s_3. So I have a mix of NAs, and I want code that can identify the source of each missing value. The code I am using lumps all kinds of missing values together:

library(dplyr)
library(forcats)

# Make sample data vars factors
dat <- dat %>%
  mutate(across(starts_with("s_"), as.factor))

# Add 'No' as factor level
dat %>%
  mutate(across(starts_with("s_"), fct_explicit_na, "No"))

What I want instead is something like this:

# A tibble: 10 x 6
     id  main   s_0   s_1   s_2   s_3  
   <dbl> <fct> <fct> <fct> <fct> <fct>
 1    1     5    75    A     4     110  
 2    2    NO1   NO1   NO1   NO1   NO1  
 3    3    11    13    NO    7     769  
 4    4    NO1   NO1   NO1   NO1   NO1   
 5    5    11    NO    NO    NO    835  
 6    6    13    39    NO    4     NO   
 7    7    NO1   NO1   NO1   NO1   NO1 
 8    8    19    42    D     6     654   
 9    9    20    4     NO    7     577  
10   10    NO1   NO1   NO1   NO1   NO1

Recommended answer

Try:

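# df is the data frame from the question (called dat there)
# Convert every column except id to character
df[-1] <- lapply(df[-1], as.character)
# Logical vector marking the rows where main is NA (not eligible)
inds <- is.na(df$main)
# Set every column except id to "NO1" in those rows
df[inds, -1] <- 'NO1'
# Any NA that remains belongs to an eligible row: recode it as "NO"
df[is.na(df)] <- 'NO'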
df

#   id main s_0 s_1 s_2 s_3
#1   1    5  75   A   4 110
#2   2  NO1 NO1 NO1 NO1 NO1
#3   3   11  13  NO   7 769
#4   4  NO1 NO1 NO1 NO1 NO1
#5   5   11  NO  NO  NO 835
#6   6   13  39  NO   4  NO
#7   7  NO1 NO1 NO1 NO1 NO1
#8   8   19  42   D   6 654
#9   9   20   4  NO   7 577
#10 10  NO1 NO1 NO1 NO1 NO1
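
The answer above leaves the recoded columns as character, while the desired output in the question shows factor columns. If factors are needed, the same two-step logic can be wrapped in a small function. The following is only a sketch in base R: recode_missing, its argument names and the four-row data frame small are hypothetical and not part of the original answer.

```r
# Hypothetical helper: recode NAs by cause and return factor columns.
recode_missing <- function(df, id_col = "id", gate_col = "main",
                           ineligible = "NO1", skipped = "NO") {
  cols <- setdiff(names(df), id_col)
  # Decide eligibility once, before any column is modified.
  not_eligible <- is.na(df[[gate_col]])
  for (col in cols) {
    x <- as.character(df[[col]])
    x[not_eligible] <- ineligible  # row was never asked
    x[is.na(x)] <- skipped         # row was asked but gave no answer
    df[[col]] <- factor(x)
  }
  df
}

# Four rows taken from the example: complete, ineligible, partly skipped.
small <- data.frame(
  id   = c(1, 2, 5, 6),
  main = c("5", NA, "11", "13"),
  s_0  = c("75", NA, NA, "39"),
  s_3  = c("110", NA, "835", NA),
  stringsAsFactors = TRUE
)
out <- recode_missing(small)
out
```

Computing not_eligible before the loop matters: main is itself one of the recoded columns, so testing is.na(df$main) after it has been overwritten would no longer find the ineligible rows.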
