Complicated reshaping

Views: 161  Published: 2017/3/25 23:47:21  Tags: r, dataframe, reshape2

Problem description

I want to reshape my data frame from long to wide format, but in doing so I lose some data that I would like to keep. Consider the following example:

df<-data.frame(Par1=unlist(strsplit("AABBCCC","")),
               Par2=unlist(strsplit("DDEEFFF","")),
               ParD=unlist(strsplit("foo,bar,baz,qux,bla,xyz,meh",",")),
               Type=unlist(strsplit("pre,post,pre,post,pre,post,post",",")),
               Val=c(10,20,30,40,50,60,70))

   #     Par1 Par2 ParD Type Val
   #   1    A    D  foo  pre  10
   #   2    A    D  bar post  20
   #   3    B    E  baz  pre  30
   #   4    B    E  qux post  40
   #   5    C    F  bla  pre  50
   #   6    C    F  xyz post  60
   #   7    C    F  meh post  70

library(reshape2)  # provides dcast

dfw <- dcast(df,
             formula = Par1 + Par2 ~ Type,
             value.var = "Val",
             fun.aggregate = mean)

 #     Par1 Par2 post pre
 #   1    A    D   20  10
 #   2    B    E   40  30
 #   3    C    F   65  50

This is almost what I need, but I would also like to have:

1. a field that keeps the data from the ParD field (for example, as a single merged string),
2. the number of observations used for each aggregation.

That is, I would like the resulting data.frame to look like this:

    #     Par1 Par2 post pre Num.pre Num.post ParD
    #   1    A    D   20  10      1      1    foo_bar 
    #   2    B    E   40  30      1      1    baz_qux
    #   3    C    F   65  50      1      2    bla_xyz_meh

I would be grateful for any ideas. For example, I tried to solve the second task by writing in dcast: fun.aggregate=function(x) c(Val=mean(x),Num=length(x)) - but this causes an error...
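
A likely reason for the error is that dcast expects fun.aggregate to return a single value per cell, so a function returning both a mean and a count does not fit. As a minimal sketch under that assumption, the counts can be obtained with a second dcast call that uses length, followed by a merge. The names means, counts and res are introduced here for illustration only.

```r
library(reshape2)

df <- data.frame(Par1 = unlist(strsplit("AABBCCC", "")),
                 Par2 = unlist(strsplit("DDEEFFF", "")),
                 Type = unlist(strsplit("pre,post,pre,post,pre,post,post", ",")),
                 Val  = c(10, 20, 30, 40, 50, 60, 70))

# One dcast call per statistic, each returning a single value per cell.
means  <- dcast(df, Par1 + Par2 ~ Type, value.var = "Val", fun.aggregate = mean)
counts <- dcast(df, Par1 + Par2 ~ Type, value.var = "Val", fun.aggregate = length)

# Prefix the count columns so they do not clash with the mean columns.
is_type <- names(counts) %in% c("pre", "post")
names(counts)[is_type] <- paste0("Num.", names(counts)[is_type])

# merge() joins on the shared columns Par1 and Par2.
res <- merge(means, counts)
```

This covers only the second task (the counts); the merged ParD string still needs a separate aggregation step.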

Thanks in advance!

Solution

A two-step solution using ddply from the plyr package (I am not happy with it, but it gives the result):

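library(plyr)  # provides ddply and .()

# Per (Par1, Par2) group: join the ParD labels and count the rows of each Type
dat <- ddply(df, .(Par1, Par2), function(x) {
  data.frame(ParD     = paste(x$ParD, collapse = "_"),
             Num.pre  = sum(x$Type == "pre"),
             Num.post = sum(x$Type == "post"))
})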

merge(dfw, dat)
  Par1 Par2 post pre        ParD Num.pre Num.post
1    A    D   20  10     foo_bar       1        1
2    B    E   40  30     baz_qux       1        1
3    C    F   65  50 bla_xyz_meh       1        2
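
If the set of Type values is known in advance, the merge step can be avoided by computing every column in a single ddply call. The following is only a sketch of that variant, not the answerer's method: it hard-codes the levels "pre" and "post", which is an assumption that holds for the example data but would need adapting for other inputs.

```r
library(plyr)

df <- data.frame(Par1 = unlist(strsplit("AABBCCC", "")),
                 Par2 = unlist(strsplit("DDEEFFF", "")),
                 ParD = unlist(strsplit("foo,bar,baz,qux,bla,xyz,meh", ",")),
                 Type = unlist(strsplit("pre,post,pre,post,pre,post,post", ",")),
                 Val  = c(10, 20, 30, 40, 50, 60, 70))

# Build the whole output row for each (Par1, Par2) group in one pass.
res <- ddply(df, .(Par1, Par2), function(x) {
  data.frame(post     = mean(x$Val[x$Type == "post"]),
             pre      = mean(x$Val[x$Type == "pre"]),
             Num.pre  = sum(x$Type == "pre"),
             Num.post = sum(x$Type == "post"),
             ParD     = paste(x$ParD, collapse = "_"))
})
```

A group with no rows of a given Type would get NaN for that mean, so this sketch assumes every group has at least one "pre" and one "post" observation.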
