How to conditionally mutate multiple columns using "contains" and "ifelse"?
I want to mutate multiple columns whose names contain the string "account". Specifically, I want these columns to take NA when a certain condition is met and to keep their current value otherwise. Below is my attempt, inspired by the answers here and here. So far it has not worked. I am still trying, but any help would be much appreciated.
My data
df<-as.data.frame(structure(list(low_account = c(1, 1, 0.5, 0.5, 0.5, 0.5), high_account = c(16,
16, 56, 56, 56, 56), mid_account_0 = c(8.5, 8.5, 28.25, 28.25,
28.25, 28.25), mean_account_0 = c(31.174, 30.1922101449275, 30.1922101449275,
33.3055555555556, 31.174, 33.3055555555556), median_account_0 = c(2.1,
3.8, 24.2, 24.2, 24.2, 24.2), low_account.1 = c(1, 1, 0.5, 0.5, 0.5,
0.5), high_account.1 = c(16, 16, 56, 56, 56, 56), row.names = c("A001", "A002", "A003", "A004", "A005", "A006"))))
df
  low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
1         1.0           16          8.50       31.17400              2.1           1.0             16      A001
2         1.0           16          8.50       30.19221              3.8           1.0             16      A002
3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
6         0.5           56         28.25       33.30556             24.2           0.5             56      A006
My attempt
sample_data<-df%>% mutate_at(select(contains("account") , ifelse(. <= df$low_account& >= df$high_account, NA, .)))
Error: No tidyselect variables were registered
Call rlang::last_error() to see a backtrace
Expected output
df
  low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
1         1.0           16          8.50             NA              2.1           1.0             16      A001
2         1.0           16          8.50             NA              3.8           1.0             16      A002
3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
6         0.5           56         28.25       33.30556             24.2           0.5             56      A006
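The issue with vars(contains('account')) is that it matches every column whose name contains the substring 'account', including 'low_account' and 'high_account' themselves. When the logical comparison runs, the 'low_account' column is converted to NA, because every value in it is necessarily lower than or equal to 'low_account'; after that, only the NA-filled column is available for the remaining comparisons. Note also that the condition has to use | rather than &: with 'low_account' below 'high_account', as in this data, a value cannot be at or below the lower bound and at or above the upper bound at the same time. So, instead, we can select only the columns of interest (the 'mid', 'mean' and 'median' columns) and then use replace: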
library(tidyverse)

df %>%
  mutate_at(vars(matches("(mid|mean|median)_account")),
            ~ replace(., . <= low_account | . >= high_account, NA))
#   low_account high_account mid_account_0 mean_account_0 median_account_0 low_account.1 high_account.1 row.names
#1         1.0           16          8.50             NA              2.1           1.0             16      A001
#2         1.0           16          8.50             NA              3.8           1.0             16      A002
#3         0.5           56         28.25       30.19221             24.2           0.5             56      A003
#4         0.5           56         28.25       33.30556             24.2           0.5             56      A004
#5         0.5           56         28.25       31.17400             24.2           0.5             56      A005
#6         0.5           56         28.25       33.30556             24.2           0.5             56      A006
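In recent dplyr releases the scoped verbs such as mutate_at() are superseded by across(). The following is a minimal sketch of the same replacement written that way, assuming dplyr 1.0.0 or later. It is not part of the original answer: the data frame is rebuilt here with only the numeric columns from the question, and the name result is chosen purely for illustration.

```r
library(dplyr)

# Numeric columns from the question (the duplicated *.1 columns and
# the row.names column are left out to keep the sketch short).
df <- data.frame(
  low_account      = c(1, 1, 0.5, 0.5, 0.5, 0.5),
  high_account     = c(16, 16, 56, 56, 56, 56),
  mid_account_0    = c(8.5, 8.5, 28.25, 28.25, 28.25, 28.25),
  mean_account_0   = c(31.174, 30.1922101449275, 30.1922101449275,
                       33.3055555555556, 31.174, 33.3055555555556),
  median_account_0 = c(2.1, 3.8, 24.2, 24.2, 24.2, 24.2)
)

# Select only the mid/mean/median columns, so the bound columns
# (low_account, high_account) are never overwritten.
result <- df %>%
  mutate(across(
    matches("(mid|mean|median)_account"),
    # Set a value to NA when it falls at or outside either bound.
    ~ replace(.x, .x <= low_account | .x >= high_account, NA)
  ))

print(result)
```

As in the mutate_at() version, the regular expression passed to matches() is what keeps low_account and high_account out of the selection, so they remain available as bounds for every comparison.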