Plotting a large number of custom functions in ggplot in R using stat_function()

Problem description

The basic issue is that I'd like to figure out how to add a large number (1000) of custom functions to the same figure in ggplot, using different values for the function coefficients. I have seen other questions about how to add two or three functions, but not 1000, and questions about adding different functional forms, but not the same form with multiple parameter values.

The goal is to have stat_function draw the lines using parameter values stored in a data frame, but with no actual data for x.

[The overall goal here is to show the large uncertainty in the model parameters of a non-linear regression fitted to a small dataset, which translates into uncertainty in predictions made from these data (which I'm trying to convince someone else is a bad idea). I often do this by plotting many lines built from the uncertainty in the model parameters (a la Andrew Gelman's Multilevel Regression textbook).]

As an example, here is the plot in base R graphics.

# The data
p.gap <- c(50, 45, 57, 43, 32, 30, 14, 36, 51)
p.ag <- c(43, 24, 52, 46, 28, 17, 7, 18, 29)
data <- data.frame(p.ag, p.gap)

# The model (using non-linear least squares regression):
fit.1.nls <- nls(formula = p.gap ~ beta1 * p.ag^(beta2), start = list(beta1 = 5.065, beta2 = 0.6168))
summary(fit.1.nls)

# From the summary, I take the means and standard errors of the two parameters and build their distributions:
beta1 <- rnorm(1000, 7.8945, 3.5689)
beta2 <- rnorm(1000, 0.4894, 0.1282)
coefs <- data.frame(beta1, beta2)

# This is the plot I want (using curve() and base R graphics):
plot(data$p.ag, data$p.gap, xlab = "% agricultural land use",
     ylab = "% of riparian buffer gap", xlim = c(0, 130), ylim = c(0, 130), pch = 20, type = "n")
# One grey curve per simulated parameter pair
for (i in 1:1000) {
  curve(coefs[i, 1] * x^(coefs[i, 2]), add = TRUE, col = "grey")
}
# The fitted (mean) curve in red, then the observed points on top
curve(coef(fit.1.nls)[[1]] * x^(coef(fit.1.nls)[[2]]), add = TRUE, col = "red")
points(data$p.ag, data$p.gap, pch = 20)
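
The means and standard errors hard-coded above can also be read from the fitted model rather than copied by hand. A minimal sketch, assuming the nls fit converges on these data; the names `est` and `n.draws` and the `set.seed()` call are illustrative additions, not part of the original question:

```r
# Data and model as in the question.
p.gap <- c(50, 45, 57, 43, 32, 30, 14, 36, 51)
p.ag  <- c(43, 24, 52, 46, 28, 17, 7, 18, 29)
fit.1.nls <- nls(p.gap ~ beta1 * p.ag^beta2,
                 start = list(beta1 = 5.065, beta2 = 0.6168))

# coef(summary(...)) is a matrix with one row per parameter and
# columns that include "Estimate" and "Std. Error".
est <- coef(summary(fit.1.nls))

set.seed(42)     # assumption: fixed seed so the draws are reproducible
n.draws <- 1000  # hypothetical name for the number of simulated curves

# Independent normal draws for each parameter, as in the question.
coefs <- data.frame(
  beta1 = rnorm(n.draws, est["beta1", "Estimate"], est["beta1", "Std. Error"]),
  beta2 = rnorm(n.draws, est["beta2", "Estimate"], est["beta2", "Std. Error"])
)
```

Like the original, this samples each parameter independently, so any correlation between beta1 and beta2 is ignored.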

I can plot the mean model function with the data in ggplot:

fit.mean <- function(x){7.8945*x^(0.4894)}
ggplot(data, aes(x=p.ag, y=p.gap)) +
  scale_x_continuous(limits=c(0,100), "% ag land use") +
  scale_y_continuous(limits=c(0,100), "% riparian buffer gap") +
  stat_function(fun=fit.mean, color="red") +
  geom_point()

But nothing I do draws multiple lines in ggplot. I can't seem to find any help on drawing functions from a set of parameter values on the ggplot website or on this site, both of which are usually very helpful. Does this violate enough plotting theory that no one dares do this?

Any help is appreciated. Thank you!

Solution

It is possible to collect multiple geoms or stats (and even other elements of a plot) into a vector or list and add that vector/list to the plot. Using this, the plyr package can be used to make a list of stat_function layers, one for each row of coefs:

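library("ggplot2")
library("plyr")
# Build one stat_function layer per row of coefs; alply returns them as a list.
coeflines <-
  alply(as.matrix(coefs), 1, function(coef) {
    stat_function(fun = function(x) {coef[1] * x^coef[2]}, colour = "grey")
  })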

Then just add this to the plot:

ggplot(data, aes(x=p.ag, y=p.gap)) +
  scale_x_continuous(limits=c(0,100), "% ag land use") +
  scale_y_continuous(limits=c(0,100), "% riparian buffer gap") +
  coeflines +
  stat_function(fun=fit.mean, color="red") +
  geom_point()
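
The same list-of-layers idea can be tried on a handful of curves first, where it draws quickly. This is a sketch that uses base R's `Map` in place of `plyr::alply` (a substitution, not the answer's code); `coefs.small` and `make_layer` are hypothetical names, and the seed is an assumption added for reproducibility:

```r
library(ggplot2)

# Data as in the question.
p.gap <- c(50, 45, 57, 43, 32, 30, 14, 36, 51)
p.ag  <- c(43, 24, 52, 46, 28, 17, 7, 18, 29)
data  <- data.frame(p.ag, p.gap)

set.seed(1)  # assumption: fixed seed for reproducibility
# Only five parameter pairs, so the plot builds quickly.
coefs.small <- data.frame(beta1 = rnorm(5, 7.8945, 3.5689),
                          beta2 = rnorm(5, 0.4894, 0.1282))

# Each call returns one layer whose function remembers its own b1 and b2.
make_layer <- function(b1, b2) {
  force(b1)
  force(b2)
  stat_function(fun = function(x) b1 * x^b2, colour = "grey")
}

# A plain list of layers, one per row of coefs.small.
layers <- Map(make_layer, coefs.small$beta1, coefs.small$beta2)

# A list of layers can be added to a plot with `+`, just like a single layer.
p <- ggplot(data, aes(x = p.ag, y = p.gap)) +
  layers +
  geom_point()
```

Calling `print(p)` draws the five grey curves under the points. The `force()` calls make sure each layer captures its own coefficients rather than a lazily evaluated promise.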

A couple of notes:

  • This is slow. It took a few minutes on my computer to draw. ggplot was not designed to be very efficient at handling circa 1000 layers.
  • This just addresses adding the 1000 lines. Per @Roland's comment, I don't know whether this represents what you want or expect it to statistically.
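
On the speed point, one alternative is to evaluate every curve on a grid of x values and draw them all with a single geom_line layer grouped by curve. This is a different technique from the stat_function list above, offered only as a sketch. The names `x.grid`, `curves`, and `id`, the grid resolution, and the seed are assumptions for illustration:

```r
library(ggplot2)

# Data and simulated coefficients as in the question.
p.gap <- c(50, 45, 57, 43, 32, 30, 14, 36, 51)
p.ag  <- c(43, 24, 52, 46, 28, 17, 7, 18, 29)
data  <- data.frame(p.ag, p.gap)

set.seed(1)  # assumption: fixed seed for reproducibility
coefs <- data.frame(beta1 = rnorm(1000, 7.8945, 3.5689),
                    beta2 = rnorm(1000, 0.4894, 0.1282))
fit.mean <- function(x) {7.8945 * x^(0.4894)}

# Evaluate every curve on a common grid and stack the results in long
# format: one row per (curve id, x value) pair.
x.grid <- seq(0, 100, length.out = 101)
id     <- rep(seq_len(nrow(coefs)), each = length(x.grid))
x      <- rep(x.grid, times = nrow(coefs))
curves <- data.frame(id = id, x = x,
                     y = coefs$beta1[id] * x^coefs$beta2[id])

# One geom_line layer draws all 1000 curves; `group` keeps them separate.
p <- ggplot(data, aes(x = p.ag, y = p.gap)) +
  geom_line(data = curves, aes(x = x, y = y, group = id), colour = "grey") +
  stat_function(fun = fit.mean, colour = "red") +
  geom_point() +
  # coord_cartesian zooms without dropping values outside the limits,
  # unlike the scale limits used in the question.
  coord_cartesian(xlim = c(0, 100), ylim = c(0, 100)) +
  labs(x = "% ag land use", y = "% riparian buffer gap")
```

Because the plot has three layers instead of roughly a thousand, it should build much faster, at the cost of fixing the x resolution in advance.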
