Transpose / reshape dataframe without "timevar" from long to wide format

Question

I have a data frame in the following long format:

   Name          MedName
  Name1    atenolol 25mg
  Name1     aspirin 81mg
  Name1 sildenafil 100mg
  Name2    atenolol 50mg
  Name2   enalapril 20mg
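
For anyone who wants to reproduce this, here is a minimal sketch of the sample data above. It assumes both columns are plain character vectors and that the object is called dataset (the name used in the answer further down); neither detail is stated in the original post.

```r
# Sample data from the table above.
# Assumption: columns are character, not factor; the object name is illustrative.
dataset <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)
```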

I would like to get the following (I do not care whether the columns are named this way; I just want the data in this layout):

   Name   medication1    medication2      medication3
  Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
  Name2 atenolol 50mg enalapril 20mg             NA

Through this very site I have become somewhat familiar with the reshape/reshape2 packages and have gone through several attempts to get this to work, but have failed so far.

When I try dcast(dataframe, Name ~ MedName, value.var='MedName') I just get a bunch of columns that are flags for the medication names (the transposed values are 1 or 0), for example:

 Name  atenolol 25mg  aspirin 81mg
Name1              1             1
Name2              0             0 

I also tried dcast(dataset, Name ~ variable) after melting the dataset; however, this just returns the following (a count of how many meds each person has):

 Name  MedName
Name1        3
Name2        2

Finally, I tried to melt the data and then reshape using idvar="Name" and timevar="variable" (whose values are all just MedName). However, this does not seem built for my issue: when there are multiple matches for the idvar, reshape just takes the first MedName and ignores the rest.

Does anyone know how to do this using reshape or another R function? I realize there is probably a messier way to do this with some for loops and conditionals that split and re-paste the data, but I was hoping for a simpler solution. Thank you so much!

Recommended answer

Assuming your data is in the object dataset:

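library(plyr)      # ddply, mutate
library(reshape2)  # dcast
## Add a medication index within each Name
data_with_index <- ddply(dataset, .(Name), mutate,
                         index = paste0('medication', seq_along(Name)))
## Spread the index into columns, filling the cells with MedName
dcast(data_with_index, Name ~ index, value.var = 'MedName')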

##    Name   medication1    medication2      medication3
## 1 Name1 atenolol 25mg   aspirin 81mg sildenafil 100mg
## 2 Name2 atenolol 50mg enalapril 20mg             <NA>
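
The same idea, building the missing "timevar" as a within-group counter, can also be sketched in base R without plyr or reshape2. This is an alternative to the answer above and not part of it. The sample data is rebuilt here so the snippet runs on its own, and the names index and wide are illustrative.

```r
# Sample data from the question (object name `dataset` as in the answer).
dataset <- data.frame(
  Name    = c("Name1", "Name1", "Name1", "Name2", "Name2"),
  MedName = c("atenolol 25mg", "aspirin 81mg", "sildenafil 100mg",
              "atenolol 50mg", "enalapril 20mg"),
  stringsAsFactors = FALSE
)

# Number the medications within each Name: 1, 2, 3, 1, 2.
# This counter plays the role of the missing "timevar".
dataset$index <- ave(seq_along(dataset$Name), dataset$Name, FUN = seq_along)

# Long to wide: one row per Name, one MedName column per index value.
wide <- reshape(dataset, idvar = "Name", timevar = "index",
                direction = "wide", sep = "")

# Rename MedName1, MedName2, ... to medication1, medication2, ...
names(wide) <- sub("^MedName", "medication", names(wide))
rownames(wide) <- NULL
wide
```

Because the counter is numeric, the columns come out in numeric order. People with fewer medications than the maximum get NA in the remaining columns.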
