dlnorm in stat_function does not fit properly

Problem description

I am trying to superimpose a function via stat_function() in ggplot2, as described in the question "Superimposing a log-normal density in ggplot and stat_function()", using this command:

ggplot(data=data, aes(x=x)) +
  geom_histogram(aes(y = ..density..)) +
  stat_function(fun = dlnorm, size=1, color='gray') +
  theme_bw()

It works with the provided example, where the data to fit is generated with rf. However, when I apply it to the data set below, the curve does not fit. What is it about my data set that keeps stat_function from fitting it? Is there a mathematical mistake in what I am trying to do? Is there a problem with the numeric type in my data.frame?

Here are the two results I get, with their respective data sets:

Does not fit:

data <- data.frame(x=c(83.92527, 75.72644, 76.44609, 100.86324, 87.44626, 78.37094, 77.71285, 94.66197, 69.76701, 83.93192, 68.26451, 71.49349, 66.51735, 76.72893, 76.76861, 81.38741, 67.9929, 74.44888, 86.06689, 76.9507, 123.47084, 90.56689, 81.50586, 74.04925, 71.85926, 91.60573, 74.57221, 68.53912, 75.34062, 80.65242, 85.15228, 104.06124, 72.42447, 75.27314, 73.01164, 84.94915, 80.04429, 86.93343, 82.04338, 77.70276, 84.0946, 84.35794, 96.01299, 72.26497, 115.12634, 74.87349, 80.4077, 77.33795, 73.4267, 68.03937, 82.50726, 78.13893, 68.7824, 85.83253, 80.94278, 78.06742, 75.68488, 133.39636, 92.89265, 80.01308, 187.60977, 86.73605, 76.10981, 71.80097, 78.31453, 75.60157, 86.07133, 76.92616, 71.48474, 133.32378, 78.6234, 131.75722, 82.31215, 74.46081, 73.87192, 82.53808, 74.79978, 68.17945, 112.14891, 89.37358, 79.76679, 75.2691, 86.79122, 79.46324, 86.15034, 74.70525, 71.61041, 82.48748, 77.10785, 73.95811, 76.25556, 82.17103, 75.97427, 80.19654, 88.01052, 75.10031, 85.93202, 78.12773, 72.52136, 93.67812))

Fits:

data <- data.frame(x = rf(100, df1 = 7, df2 = 120))

Solution

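dlnorm takes the parameters meanlog and sdlog (the mean and standard deviation on the log scale), and their default values are 0 and 1. With those defaults the density is concentrated around 1, which happens to suit the rf example but not data centered near 80. You have to estimate the parameters for the actual data set. This can be done with the function fitdistr in the MASS package.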

library(MASS)
fit <- fitdistr(data$x, "lognormal")
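To make the estimation step less opaque, here is a minimal base-R sketch of what the log-normal fit amounts to, assuming the usual maximum-likelihood estimates: the mean of log(x) and the standard deviation of log(x) with divisor n. The vector x holds only the first ten values from the question's data set, and the names meanlog_hat and sdlog_hat are illustrative. The results should agree with fitdistr(x, "lognormal")$estimate up to rounding, but treat that as an expectation to check rather than a guarantee.

```r
# First ten values from the question's data set (shortened for illustration)
x <- c(83.92527, 75.72644, 76.44609, 100.86324, 87.44626,
       78.37094, 77.71285, 94.66197, 69.76701, 83.93192)

# Maximum-likelihood estimates of the log-scale parameters
log_x <- log(x)
meanlog_hat <- mean(log_x)
sdlog_hat <- sqrt(mean((log_x - meanlog_hat)^2))  # divisor n, not n - 1

# Density at a typical data value: defaults (meanlog = 0, sdlog = 1) vs. estimates
default_density <- dlnorm(80)
fitted_density <- dlnorm(80, meanlog = meanlog_hat, sdlog = sdlog_hat)

print(c(meanlog = meanlog_hat, sdlog = sdlog_hat))
print(c(default = default_density, fitted = fitted_density))
```

The default density at 80 is practically zero, which is why the curve drawn with the default parameters looks flat against the histogram.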

Now, you can use the estimates for the dlnorm function:

ggplot(data=data, aes(x=x)) +
  geom_histogram(aes(y = ..density..)) +
  # dlnorm's parameters are named meanlog and sdlog
  stat_function(fun = dlnorm, size = 1, color = 'gray',
                args = list(meanlog = fit$estimate[["meanlog"]],
                            sdlog = fit$estimate[["sdlog"]])) +
  theme_bw()
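The plotting call above uses `..density..` and `size`, which recent ggplot2 releases flag as deprecated. The following is a sketch of the same plot in the newer spelling, assuming a ggplot2 version that provides `after_stat()` and the `linewidth` aesthetic (roughly 3.4.0 or later). The data here is simulated with rlnorm as a stand-in for the question's data set, and `bins = 30` is an arbitrary choice.

```r
library(ggplot2)
library(MASS)

# Simulated stand-in for the question's data (values centered near 80)
set.seed(1)
data <- data.frame(x = rlnorm(100, meanlog = 4.4, sdlog = 0.15))

# Estimate the log-scale parameters from the data
fit <- fitdistr(data$x, "lognormal")

p <- ggplot(data, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30) +
  stat_function(fun = dlnorm, linewidth = 1, colour = "gray",
                args = list(meanlog = fit$estimate[["meanlog"]],
                            sdlog = fit$estimate[["sdlog"]])) +
  theme_bw()

print(p)
```

Passing the estimates by their full names, meanlog and sdlog, avoids relying on R's partial argument matching.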
