Any suggestions for how I can plot mixEM type data using ggplot2


Problem description


I have a sample of 1 million records obtained from my original data. For reference, the following dummy data should generate an approximately similar distribution:

b <- data.frame(matrix(rnorm(2000000, mean=c(8,17), sd=2)))
c <- b[sample(nrow(b), 1000000), ]

I believed the histogram to be a mixture of two log-normal distributions, and I tried to fit the summed distributions with the EM algorithm using the following code:

install.packages("mixtools")
library(mixtools)
# the line below returns EM output of class mixEM for a mixture of normal distributions
c1 <- normalmixEM(c, lambda=NULL, mu=NULL, sigma=NULL) 
plot(c1, density=TRUE)

The first plot is a log-likelihood plot, and the second (if you hit return again) shows the fitted density curves over the histogram.

As I mentioned, c1 is of class mixEM, and the plot() function can handle that. I want to fill the density curves with colors. This is easy to do with ggplot2, but ggplot() does not support data of class mixEM and throws this message:

"ggplot doesn't know how to deal with data of class mixEM"

Is there any other approach I can take for this problem? Any suggestions are greatly appreciated!

Thanks!

Solution

Look at the structure of the returned object (this should be documented in the help):

> # simple mixture of normals:
> x=c(rnorm(10000,8,2),rnorm(10000,17,4))
> xMix = normalmixEM(x, lambda=NULL, mu=NULL, sigma=NULL)

Now what:

> str(xMix)
List of 9
 $ x         : num [1:20000] 6.18 9.92 9.07 8.84 9.93 ...
 $ lambda    : num [1:2] 0.502 0.498
 $ mu        : num [1:2] 7.99 17.05
 $ sigma     : num [1:2] 2.03 4.02
 $ loglik    : num -59877

The lambda, mu, and sigma components define the returned normal densities. You can plot these in ggplot using qplot and stat_function. But first make a function that returns scaled normal densities:

sdnorm =
function(x, mean=0, sd=1, lambda=1){lambda*dnorm(x, mean=mean, sd=sd)}
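As a quick sanity check on why the scaling matters, here is a minimal sketch in base R. It assumes a stand-in list `fit` (a hypothetical name) holding the lambda, mu and sigma values copied from the str() output above, rather than a real mixEM object. The area under each scaled component should come out close to its mixing weight, and the sum of the scaled components should integrate to roughly 1.

```r
# Scaled normal density, as defined in the answer.
sdnorm <- function(x, mean = 0, sd = 1, lambda = 1) {
  lambda * dnorm(x, mean = mean, sd = sd)
}

# Hypothetical stand-in for the fitted object: values copied from str(xMix).
fit <- list(lambda = c(0.502, 0.498),
            mu     = c(7.99, 17.05),
            sigma  = c(2.03, 4.02))

# Area under each scaled component, integrated over mu +/- 8 sigma.
areas <- sapply(seq_along(fit$lambda), function(k) {
  integrate(sdnorm,
            lower = fit$mu[k] - 8 * fit$sigma[k],
            upper = fit$mu[k] + 8 * fit$sigma[k],
            mean = fit$mu[k], sd = fit$sigma[k],
            lambda = fit$lambda[k])$value
})

# The mixture density is the sum of the scaled components.
mix_density <- function(x) {
  sdnorm(x, mean = fit$mu[1], sd = fit$sigma[1], lambda = fit$lambda[1]) +
    sdnorm(x, mean = fit$mu[2], sd = fit$sigma[2], lambda = fit$lambda[2])
}
total_area <- integrate(mix_density,
                        lower = min(fit$mu - 8 * fit$sigma),
                        upper = max(fit$mu + 8 * fit$sigma))$value
```

Because each curve is scaled by its lambda, the filled components line up with a histogram or density estimate of the full data set drawn on the density scale.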

Then:

qplot(x, geom="density") +
 stat_function(fun=sdnorm,
    args=list(mean=xMix$mu[1], sd=xMix$sigma[1], lambda=xMix$lambda[1]),
    fill="blue", geom="polygon") +
 stat_function(fun=sdnorm,
    args=list(mean=xMix$mu[2], sd=xMix$sigma[2], lambda=xMix$lambda[2]),
    fill="#FF0000", geom="polygon")

Or whatever ggplot skills you have. Transparent colours on the densities might be nice.

ggplot(data.frame(x=x)) +
 geom_histogram(aes(x=x,y=..density..),fill="white",color="black") +
 stat_function(fun=sdnorm,
    args=list(mean=xMix$mu[2],
              sd=xMix$sigma[2],
              lambda=xMix$lambda[2]),
    fill="#FF000080",geom="polygon") +
 stat_function(fun=sdnorm,
    args=list(mean=xMix$mu[1],
              sd=xMix$sigma[1],
              lambda=xMix$lambda[1]),
    fill="#00FF0080",geom="polygon")

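This produces a histogram of the data with the two scaled component densities overlaid as semi-transparent filled curves.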
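If the mixture has more than two components, repeating stat_function by hand gets tedious. One possible alternative, sketched here under assumptions, is to evaluate the scaled components on a grid and hand ggplot an ordinary data frame. The helper name `mix_components_df` and the stand-in list `fit` are hypothetical and not part of mixtools. Since the str() output shows that a normalmixEM result carries `lambda`, `mu` and `sigma` components, the fitted `xMix` object could presumably be passed in place of `fit`.

```r
# Hypothetical helper (not part of mixtools): evaluate every scaled component
# of a normal mixture on a common grid and return a long data frame.
mix_components_df <- function(fit, n = 200) {
  k <- length(fit$lambda)
  grid <- seq(min(fit$mu - 4 * fit$sigma),
              max(fit$mu + 4 * fit$sigma),
              length.out = n)
  do.call(rbind, lapply(seq_len(k), function(i) {
    data.frame(
      x = grid,
      density = fit$lambda[i] * dnorm(grid, mean = fit$mu[i], sd = fit$sigma[i]),
      component = factor(i, levels = seq_len(k))
    )
  }))
}

# Stand-in for the fitted object: values copied from str(xMix).
fit <- list(lambda = c(0.502, 0.498),
            mu     = c(7.99, 17.05),
            sigma  = c(2.03, 4.02))
comp_df <- mix_components_df(fit)

# Plot only if ggplot2 is installed; one filled area per component.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(comp_df, aes(x = x, y = density, fill = component)) +
    geom_area(alpha = 0.5, position = "identity")
  # print(p) draws the plot
}
```

Using position = "identity" overlays the components instead of stacking them, which matches the look of the two stat_function layers above.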
