How to return a vector of codes for each level of a variable in R?


Question

I have a data frame with several diagnosis codes for each health condition (one code per row). I need to get a vector of codes for each condition, and each vector needs to be named after its condition.

Condition <- as.character(c("COPD", "COPD", "COPD", "COPD", "HIV", "HIV", "HIV", "Sepsis", "Sepsis", "Sepsis", "Sepsis", "Sepsis"))
Code <- as.character(c("6A61.00", "8BPT.00", "8BPT000", "8BPT100", "E2E0.00", "E2E0100", "E2E0z00", "E2E1.00", "E2E2.00", "E2Ey.00", "E2Ez.00", "Eu84400"))
df <- data.frame(Condition, Code)

df
   Condition    Code
1       COPD 6A61.00
2       COPD 8BPT.00
3       COPD 8BPT000
4       COPD 8BPT100
5        HIV E2E0.00
6        HIV E2E0100
7        HIV E2E0z00
8     Sepsis E2E1.00
9     Sepsis E2E2.00
10    Sepsis E2Ey.00
11    Sepsis E2Ez.00
12    Sepsis Eu84400

What I expect to get:

> COPD
[1] "6A61.00" "8BPT.00" "8BPT000" "8BPT100"
> HIV
[1] "E2E0.00" "E2E0100" "E2E0z00"
> Sepsis
[1] "E2E1.00" "E2E2.00" "E2Ey.00" "E2Ez.00" "Eu84400"

However, I don't want to have to create the vector for each condition with a separate line of code, like this:

COPD <- df$Code[which(df$Condition=="COPD")]
HIV <- df$Code[which(df$Condition=="HIV")]
Sepsis <- df$Code[which(df$Condition=="Sepsis")] 

Is there a better way to get one vector of codes for each condition in a single step? (I have ~300 conditions.)

Additionally, I don't want the vectors to be returned as factors, as is currently happening:

> COPD
[1] 6A61.00 8BPT.00 8BPT000 8BPT100
12 Levels: 6A61.00 8BPT.00 8BPT000 8BPT100 E2E0.00 E2E0100 E2E0z00 E2E1.00 ... Eu84400
> HIV
[1] E2E0.00 E2E0100 E2E0z00
12 Levels: 6A61.00 8BPT.00 8BPT000 8BPT100 E2E0.00 E2E0100 E2E0z00 E2E1.00 ... Eu84400
> Sepsis
[1] E2E1.00 E2E2.00 E2Ey.00 E2Ez.00 Eu84400
12 Levels: 6A61.00 8BPT.00 8BPT000 8BPT100 E2E0.00 E2E0100 E2E0z00 E2E1.00 ... Eu84400

I appreciate your help with this.

Recommended answer

You can use:

list2env(split(as.character(df$Code), df$Condition), .GlobalEnv)

COPD
#[1] "6A61.00" "8BPT.00" "8BPT000" "8BPT100"

HIV
#[1] "E2E0.00" "E2E0100" "E2E0z00"

Sepsis
#[1] "E2E1.00" "E2E2.00" "E2Ey.00" "E2Ez.00" "Eu84400"

However, creating so many independent vectors in the global environment is not considered good practice, because they are difficult to manage. It is better to keep them in the list itself.
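As a minimal sketch of the list-based approach, the result of `split()` can be kept as a named list instead of being passed to `list2env()`. The name `codes_by_condition` is hypothetical, and the data frame is rebuilt here with `stringsAsFactors = FALSE` on the assumption that the factor output in the question came from the pre-4.0 default of `data.frame()`:

```r
# Rebuild the example data. stringsAsFactors = FALSE keeps Code as character
# even on R versions before 4.0, where data.frame() converted strings to factors.
df <- data.frame(
  Condition = c(rep("COPD", 4), rep("HIV", 3), rep("Sepsis", 5)),
  Code = c("6A61.00", "8BPT.00", "8BPT000", "8BPT100",
           "E2E0.00", "E2E0100", "E2E0z00",
           "E2E1.00", "E2E2.00", "E2Ey.00", "E2Ez.00", "Eu84400"),
  stringsAsFactors = FALSE
)

# One list element per condition, named after the condition.
# codes_by_condition is a hypothetical name chosen for this example.
codes_by_condition <- split(df$Code, df$Condition)

# Look up the codes for a single condition by name.
codes_by_condition[["HIV"]]
#[1] "E2E0.00" "E2E0100" "E2E0z00"

# Work on all conditions at once, e.g. count the codes per condition.
lengths(codes_by_condition)
```

With ~300 conditions, a single list like this can be looped over with `lapply()` or indexed by condition name, without adding hundreds of objects to the workspace.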
