How to combine the across() function with mutate() and case_when() to mutate values in multiple columns according to a condition?


Problem description

I have a demographic data set that includes the ages of the people in each household. It is collected via a survey, and participants are allowed to refuse to provide their age.

The result is a data set with one household per row (each with a household ID code) and various household characteristics, such as age, in the columns. Refused responses are coded as "R". You can re-create a sample using the code below:

library(dplyr)  # provides as_tibble(), mutate(), across() and case_when()

df <- list(Household_ID = c("1A", "1B", "1C", "1D", "1E"),
           AGE1 = c("25", "47", "39", "50", "R"),
           AGE2 = c("66", "23", "71", "R", "16"),
           AGE3 = c("28", "17", "R", "R", "80"),
           AGE4 = c("81", "22", "48", "59", "R"))

df <- as_tibble(df)

> df
# A tibble: 5 x 5
  Household_ID AGE1  AGE2  AGE3  AGE4 
  <chr>        <chr> <chr> <chr> <chr>
1 1A           25    66    28    81   
2 1B           47    23    17    22   
3 1C           39    71    R     48   
4 1D           50    R     R     59   
5 1E           R     16    80    R 

For our purposes, we recode "R" to "-9" so that we can subsequently convert the AGE columns to integer and carry out the analysis. We usually do this in other software, and my objective is to replicate the process in R.

I have managed to do this with the following code:

df <- df %>% mutate(AGE1 = case_when(AGE1 == "R" ~ "-9", TRUE ~ as.character(AGE1)))
df <- df %>% mutate(AGE2 = case_when(AGE2 == "R" ~ "-9", TRUE ~ as.character(AGE2)))
df <- df %>% mutate(AGE3 = case_when(AGE3 == "R" ~ "-9", TRUE ~ as.character(AGE3)))
df <- df %>% mutate(AGE4 = case_when(AGE4 == "R" ~ "-9", TRUE ~ as.character(AGE4)))

This feels clumsy, so I looked for a solution using mutate_if() etc., but read that these have been superseded by across(). Hence, I tried to replicate the operation using across():

df <- df %>%
  mutate(across(AGE1:AGE4),
          ~ (case_when(. == "R" ~ "-9")))

But I get the following error:

Error: Problem with `mutate()` input `..2`.
x Input `..2` must be a vector, not a `formula` object.
i Input `..2` is `~(case_when(. == "R" ~ "-9"))`.

I have been wrestling with this and googling for a while now, but can't figure out what I am missing. I would really appreciate some input on how to get this working. Thank you.

Solved!

df <- df %>%
  mutate(across(AGE1:AGE4, ~ (case_when(.x == "R" ~ "-9", TRUE ~ as.character(.x)))))
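For comparison with the failing attempt: there the across() call is closed right after the column selection, so the formula reaches mutate() as a second, separate argument, which is what the error about input `..2` reports. Below is a minimal sketch of the working form, with the formula passed inside across(). It uses the sample data from the question, built with tibble() (re-exported by dplyr) instead of list() plus as_tibble(); the name `recoded` is only illustrative.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  Household_ID = c("1A", "1B", "1C", "1D", "1E"),
  AGE1 = c("25", "47", "39", "50", "R"),
  AGE2 = c("66", "23", "71", "R", "16"),
  AGE3 = c("28", "17", "R", "R", "80"),
  AGE4 = c("81", "22", "48", "59", "R")
)

# The formula is the second argument of across(), not of mutate():
# note that the parenthesis closing across() comes after the formula.
recoded <- df %>%
  mutate(across(AGE1:AGE4, ~ case_when(.x == "R" ~ "-9",
                                       TRUE      ~ .x)))
```

The `TRUE ~ .x` branch matters: without it, every value that is not "R" would become NA, because case_when() returns NA when no condition matches.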

Recommended answer

Or maybe this one, which is not much different from dear @TarJae's interpretation:

library(dplyr)
library(stringr)


df %>%
  mutate(across(AGE1:AGE4, ~ str_replace(., "R", "-9")),
         across(AGE1:AGE4, as.integer))

# A tibble: 5 x 5
  Household_ID  AGE1  AGE2  AGE3  AGE4
  <chr>        <int> <int> <int> <int>
1 1A              25    66    28    81
2 1B              47    23    17    22
3 1C              39    71    -9    48
4 1D              50    -9    -9    59
5 1E              -9    16    80    -9
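One thing to keep in mind: str_replace() treats "R" as a regular expression and replaces the first match anywhere in the string. That is fine for this data, where "R" is the only non-numeric value. If the columns could contain other codes that include the letter R, an exact comparison may be safer. The following is a sketch of that variant, not part of the original answer; it uses only dplyr, does the recode and the integer conversion in a single across() call, and the name `ages` is illustrative.

```r
library(dplyr)

# Sample data from the question
df <- tibble(
  Household_ID = c("1A", "1B", "1C", "1D", "1E"),
  AGE1 = c("25", "47", "39", "50", "R"),
  AGE2 = c("66", "23", "71", "R", "16"),
  AGE3 = c("28", "17", "R", "R", "80"),
  AGE4 = c("81", "22", "48", "59", "R")
)

# Replace only cells that are exactly "R", then convert to integer.
ages <- df %>%
  mutate(across(AGE1:AGE4, ~ as.integer(if_else(.x == "R", "-9", .x))))
```

Because the replacement happens before as.integer(), no "NAs introduced by coercion" warning is raised for the refused responses.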

Data:

df <- list(Household_ID = c("1A", "1B", "1C", "1D", "1E"),
           AGE1 = c("25", "47", "39", "50", "R"),
           AGE2 = c("66", "23", "71", "R", "16"),
           AGE3 = c("28", "17", "R", "R", "80"),
           AGE4 = c("81", "22", "48", "59", "R"))

df <- as_tibble(df)
