Density plots with multiple groups

Problem description


I am trying to produce something similar to densityplot() from the lattice package, using ggplot2 after using multiple imputation with the mice package. Here is a reproducible example:

require(mice)
dt <- nhanes
impute <- mice(dt, seed = 23109)
x11()
densityplot(impute)

Which produces:

I would like to have some more control over the output (and I am also using this as a learning exercise for ggplot). So, for the bmi variable, I tried this:

bar <- NULL
for (i in 1:impute$m) {
    foo <- complete(impute,i)
    foo$imp <- rep(i,nrow(foo))
    foo$col <- rep("#000000",nrow(foo))
    bar <- rbind(bar,foo)
}

imp <-rep(0,nrow(impute$data))
col <- rep("#D55E00", nrow(impute$data))
bar <- rbind(bar,cbind(impute$data,imp,col))
bar$imp <- as.factor(bar$imp)

x11()
ggplot(bar, aes(x=bmi, group=imp, colour=col)) + geom_density()
+ scale_fill_manual(labels=c("Observed", "Imputed"))

which produces this:

So there are several problems with it:

  1. The colours are wrong. It seems my attempt to control the colours is completely wrong/ignored
  2. There are unwanted horizontal and vertical lines
  3. I would like the legend to show Imputed and Observed but my code gives the error invalid argument to unary operator

Moreover, it seems like quite a lot of work to do what densityplot(impute) accomplishes in one line, so I wondered whether I might be going about this in entirely the wrong way?

Edit: I should add the fourth problem, as noted by @ROLO:

  4. The range of the plots seems to be incorrect.

Solution

The reason it is more complicated using ggplot2 is that you are using densityplot from the mice package (mice::densityplot.mids to be precise - check out its code), not from lattice itself. This function has all the functionality for plotting mids result classes from mice built in. If you tried the same with lattice::densityplot, you would find it to be at least as much work as using ggplot2.

But without further ado, here is how to do it with ggplot2:

require(reshape2)
require(ggplot2)
# Obtain the imputed data, together with the original data
imp <- complete(impute, "long", include = TRUE)
# Melt into long format
imp <- melt(imp, c(".imp", ".id", "age"))
# Add a variable for the plot legend
imp$Imputed <- ifelse(imp$.imp == 0, "Observed", "Imputed")

# Plot. Be sure to use stat_density instead of geom_density in order
#  to prevent what you call "unwanted horizontal and vertical lines"
ggplot(imp, aes(x=value, group=.imp, colour=Imputed)) + 
    stat_density(geom = "path",position = "identity") +
    facet_wrap(~variable, ncol=2, scales="free")
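To address the colour and legend problems from the question directly, here is a minimal sketch that maps the Observed/Imputed label to colour and then fixes the actual colours with scale_colour_manual. The data frame toy is synthetic and only stands in for the melted mice output; the hex codes are the ones from the question. Note also that the "invalid argument to unary operator" error most likely came from starting a new line with +, which R parses as a unary plus once the previous line is already a complete expression.

```r
library(ggplot2)

# Synthetic stand-in for the melted data (an assumption, not the mice output):
# .imp == 0 marks the observed values, .imp > 0 the imputed data sets.
set.seed(1)
toy <- data.frame(
  .imp  = rep(0:2, each = 50),
  value = rnorm(150, mean = 26, sd = 4)
)
toy$Imputed <- ifelse(toy$.imp == 0, "Observed", "Imputed")

# Keep `+` at the END of each line so the expression stays open.
# A line that BEGINS with `+` is evaluated on its own as a unary plus.
p <- ggplot(toy, aes(x = value, group = .imp, colour = Imputed)) +
  stat_density(geom = "path", position = "identity") +
  # Named values pin each legend label to a colour; this is the colour
  # scale, not scale_fill_manual, because the mapped aesthetic is colour.
  scale_colour_manual(values = c(Observed = "#D55E00", Imputed = "#000000"))

print(p)
```

Mapping a label column and choosing colours in the scale, rather than putting hex codes in a data column, is what makes the legend read "Observed" and "Imputed".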

But as you can see the ranges of these plots are smaller than those from densityplot. This behaviour should be controlled by parameter trim of stat_density, but this seems not to work. After fixing the code of stat_density I got the following plot:
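To see what is meant by the range of a density curve, here is a small base R sketch. It assumes (this is not confirmed above) that the wider curves in densityplot come from the default behaviour of density(), which extends the evaluation grid cut = 3 bandwidths beyond the data. The vector x is synthetic and only stands in for one group of bmi values.

```r
set.seed(1)
x <- rnorm(30, mean = 26, sd = 4)   # synthetic stand-in for one group of values

d_default <- density(x)             # default cut = 3: grid extends past the data
d_trimmed <- density(x, cut = 0)    # grid stops exactly at the data extremes

range(x)            # range of the raw data
range(d_default$x)  # wider than the data range
range(d_trimmed$x)  # identical to the data range
```

If the untrimmed range is what you want, one workaround is to compute the densities per group this way, collect the x and y components into a data frame, and draw them with geom_line.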

Still not exactly the same as the densityplot original, but much closer.

Edit: for a true fix we'll need to wait for the next major version of ggplot2, see github.
