如何在同一个ggplot2(R)上拟合负二项式,正态和泊松密度函数,但缩放到计数数据? [英] How to fit a negative binomial, normal, and poisson density function on the same ggplot2 (R) but scaled to the count data?

查看:197
本文介绍了如何在同一个ggplot2(R)上拟合负二项式,正态和泊松密度函数,但缩放到计数数据?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一些计数数据。我想用计数数据绘制直方图,并添加负二项式,正态和泊松密度函数,但将这些函数拟合到计数数据中。



我尝试了以下


I have some count data. I want to plot histogram with the count data and add the negative binomial, normal, and Poisson density function but fit the functions to the count data.

I tried following this example but (a) I have trouble fitting the negative binomial and poisson functions (b) No where close to scaling it to the count data level (c) Dont know how to fit all three on same graph with legends for each line (d) Also, how can I get basic stats of each fit? for example, the neg binomial fit will generate a parameter k. How can I get that to appear on the plot

set.seed(111)
counts <- rbinom(500,100,0.1) 
df <- data.frame(counts)

ggplot(df, aes(x = counts)) + 
  geom_histogram(aes(y=..density..),colour = "black", fill = "white") +
  stat_function(fun=dnorm,args=fitdistr(df$counts,"normal")$estimate)

ggplot(df, aes(x = counts)) + 
  geom_histogram(aes(y=..density..),colour = "black", fill = "white") +
  stat_function(fun=poisson,args=fitdistr(df$counts,"poisson")$estimate)

ggplot(df, aes(x = counts)) + 
  geom_histogram(aes(y=..density..),colour = "black", fill = "white") +
  stat_function(fun=dnbinom,args=fitdistr(df$counts,"dnbinom")$estimate)

解决方案

You have a few issues, first "dnbinom" is not a valid distribution for MASS::fitdistr. Secondly, MASS::fitdistr wasn't able to get a fit with the default method, so we can use method = "SANN". Thirdly, stat_function tries to evaluate dnbinom at non-integer values unless you tell it otherwise, which doesn't work.

Getting the parameters to show in the legend is a little tricky, because you'll have to estimate them outside of the ggplot call. I was lazy and used purrr::map2, but you could use some base R functions to do the same thing.

library(purrr)
library(dplyr)
norm.params <- fitdistr(df$counts,"normal")$estimate
poisson.params <- fitdistr(df$counts,"poisson")$estimate
negbinom.params <- fitdistr(df$counts,"negative binomial", method = "SANN")$estimate

dist.params <- map(list(Normal = norm.params,Poisson = poisson.params,`Negative Binomial` = negbinom.params),
    ~ map2(names(.),.,~ paste0(.x," = ",round(.y,2))) %>% unlist %>% paste0(.,collapse = ", ")) %>% 
    map2_chr(names(.),., ~ paste(.x,.y,sep=":\n"))

Finally, if we want to scale by counts, as found in this answer, we just define anonymous functions.

mybinwidth = 1
ggplot(df, aes(x = counts)) + 
  geom_histogram(aes(y=..count..),colour = "black", fill = "white", binwidth = mybinwidth) +
  stat_function(aes(color = "black"),fun=function(x,mean,sd) mybinwidth * nrow(df) * dnorm(x,mean, sd),
                args=fitdistr(df$counts,"normal")$estimate) +
  stat_function(aes(color = "blue"),fun=function(x,lambda) mybinwidth * nrow(df) * dpois(x,lambda), 
                args=fitdistr(df$counts,"poisson")$estimate,
                xlim=c(1,20), n=20) + 
  stat_function(aes(color = "orange"),fun=function(x,size, mu) mybinwidth * nrow(df) * dnbinom(x,size = size, mu = mu),
                args=fitdistr(df$counts,"negative binomial", method="SANN")$estimate,
                xlim=c(1,20),n=20) + 
  scale_color_manual("Distribution", values=c(black="black",blue="blue",orange="orange"),
                     labels=dist.params)

这篇关于如何在同一个ggplot2(R)上拟合负二项式,正态和泊松密度函数,但缩放到计数数据?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆