Transpose / reshape dataframe without "timevar" from long to wide format
Question
I have a data frame in the following long format:
Name MedName
Name1 atenolol 25mg
Name1 aspirin 81mg
Name1 sildenafil 100mg
Name2 atenolol 50mg
Name2 enalapril 20mg
I would like to get the result below (I do not care whether the columns end up named this way; I just want the data in this shape):
Name medication1 medication2 medication3
Name1 atenolol 25mg aspirin 81mg sildenafil 100mg
Name2 atenolol 50mg enalapril 20mg NA
Through this very site I have become somewhat familiar with the reshape/reshape2 packages, and I have made several attempts to get this to work, but so far all have failed.
When I try dcast(dataframe, Name ~ MedName, value.var='MedName'), I just get a bunch of columns that are flags for the medication names (the transposed values are 1 or 0), for example:
Name atenolol 25mg aspirin 81mg
Name1 1 1
Name2 0 0
I also tried dcast(dataset, Name ~ variable) after melting the dataset, but this just spits out the following (a count of how many meds each person has):
Name MedName
Name1 3
Name2 2
Finally, I tried melting the data and then reshaping with idvar="Name" and timevar="variable" (whose values are all just MedName). However, this does not seem built for my problem: when there are multiple matches for the idvar, reshape takes only the first MedName and ignores the rest.
Does anyone know how to do this using reshape or another R function? I realize there is probably a messier way to do this with some for loops and conditionals that split and re-paste the data, but I was hoping for a simpler solution. Thank you so much!
Recommended answer
Assuming your data is in the object dataset:
library(plyr)      # ddply, mutate
library(reshape2)  # dcast
## Add a per-person medication index (medication1, medication2, ...)
data_with_index <- ddply(dataset, .(Name), mutate,
                         index = paste0('medication', seq_along(Name)))
## Cast to wide format, one column per index value
dcast(data_with_index, Name ~ index, value.var = 'MedName')
##    Name   medication1    medication2      medication3
## 1 Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
## 2 Name2 atenolol 50mg enalapril 20mg             <NA>
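The same idea (number the medications within each Name, then use that number as the missing time variable) can probably also be done in base R, without plyr or reshape2. The following is only a sketch under stated assumptions: the example data is retyped from the question, the helper column name index and the result name wide are made up for illustration, and the columns come out as MedName1, MedName2, ... rather than medication1, medication2, ...

```r
# Example data retyped from the question (assumed to match the real dataset)
dataset <- data.frame(
  Name = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

# Number the rows within each Name: 1, 2, 3 for Name1 and 1, 2 for Name2.
# "index" is a hypothetical column name standing in for the missing timevar.
dataset$index <- ave(seq_along(dataset$Name), dataset$Name, FUN = seq_along)

# With a proper timevar, reshape() keeps every MedName instead of only the first
wide <- reshape(dataset, idvar = "Name", timevar = "index",
                v.names = "MedName", direction = "wide", sep = "")
print(wide)
```

Because the index is unique within each Name, reshape no longer has multiple rows competing for the same cell, which is what caused it to keep only the first MedName in the attempt described in the question.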