How to reclassify/replace values based on priority when there are repeats
Problem description
I have a df where value indicates the status of a drug:
g1 = data.frame (
drug = c('a','a','a','d','d'),
value = c('fda','trial','case','case','pre')
)
drug value
1 a fda
2 a trial
3 a case
4 d case
5 d pre
For any drug that appears more than once, I want to replace its value according to the following order of priority for value:
fda > trial > case > pre
For example, if drug d is "case" as well as "pre", every occurrence of d should be reclassified as "case". The final table should look like this:
drug value
1 a fda
2 a fda
3 a fda
4 d case
5 d case
How can I do this without looping through each drug, working out the precedence first, and then replacing?
Recommended answer
Since this is an ordinal variable, you can convert g1$value to an ordered factor, which is the corresponding class. You can then use functions like min and max as you would with a numeric vector. Because the levels are listed from highest to lowest priority, min returns the highest-priority status in each group:
g1$value <- ordered(g1$value, levels = c("fda", "trial", "case", "pre"))
g1$value
#[1] fda trial case case pre
#Levels: fda < trial < case < pre
g1$value <- ave(g1$value, g1$drug, FUN=min)
g1
# drug value
#1 a fda
#2 a fda
#3 a fda
#4 d case
#5 d case
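If value should stay a plain character column rather than become an ordered factor, a similar result can be sketched in base R with match(). This is a variant, not the method from the answer above; the names priority, rank and best are illustrative only.

```r
# Rebuild the example data from the question
g1 <- data.frame(
  drug  = c("a", "a", "a", "d", "d"),
  value = c("fda", "trial", "case", "case", "pre")
)

# Assumed priority vector, highest priority first
priority <- c("fda", "trial", "case", "pre")

# Position of each status in the priority vector (1 = highest priority)
rank <- match(g1$value, priority)

# Smallest rank within each drug, repeated for every row of that drug
best <- ave(rank, g1$drug, FUN = min)

# Map the winning rank back to its label
g1$value <- priority[best]
g1
```

A status that does not appear in priority gets a rank of NA, and so does the whole drug group it belongs to, so such values would need to be handled separately.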
Or, with dplyr:
library(dplyr)
g1 %>%
  mutate(value = ordered(value, levels = c("fda", "trial", "case", "pre"))) %>%
  group_by(drug) %>%
  mutate(value = min(value))
Neither the row order in the dataset nor the set of values present in any drug group should affect this result:
g2 = data.frame (
drug = c( "a","a","a","d","d","e","e","e"),
value = c("fda","trial","case","case","pre","pre","fda","case")
)
# drug value
#1 a fda
#2 a trial
#3 a case
#4 d case
#5 d pre
#6 e pre
#7 e fda
#8 e case
g2 %>%
mutate(value = ordered(value, levels = c("fda", "trial", "case", "pre"))) %>%
group_by(drug) %>%
mutate(value = min(value))
## A tibble: 8 x 2
## Groups: drug [3]
# drug value
# <fct> <ord>
#1 a fda
#2 a fda
#3 a fda
#4 d case
#5 d case
#6 e fda
#7 e fda
#8 e fda
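The row-order claim can be checked with a small base R sketch. It assumes the same level order as above; reclassify and per_drug are hypothetical helper names, and the shuffle uses an arbitrary seed.

```r
g2 <- data.frame(
  drug  = c("a", "a", "a", "d", "d", "e", "e", "e"),
  value = c("fda", "trial", "case", "case", "pre", "pre", "fda", "case")
)

# Level order from the answer, highest priority first
lv <- c("fda", "trial", "case", "pre")

# Hypothetical helper: apply the ordered-factor + ave() approach to a data frame
reclassify <- function(df) {
  df$value <- ordered(df$value, levels = lv)
  df$value <- ave(df$value, df$drug, FUN = min)
  df
}

# Hypothetical helper: one row per drug after reclassifying, sorted by drug
per_drug <- function(df) {
  out <- unique(reclassify(df))
  out[order(out$drug), ]
}

set.seed(42)                        # arbitrary seed, only for reproducibility
shuffled <- g2[sample(nrow(g2)), ]  # same rows in a random order

original_result <- per_drug(g2)
shuffled_result <- per_drug(shuffled)

# TRUE if every drug ends up with the same status in both versions
same <- identical(as.character(original_result$value),
                  as.character(shuffled_result$value))
same
```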