Continuous quantiles of a scatterplot


Problem description

I have a data set for which I plotted a regression curve (using ggplot2's stat_smooth):

ggplot(data = mydf, aes(x=time, y=pdm)) + geom_point() + stat_smooth(col="red") 

I'd also like to plot the quantiles using the same method (if it's simpler, just the quartiles will do). All I have managed to get is the following:

ggplot(data = mydf, aes(x=time, y=pdm, z=surface)) + geom_point() + stat_smooth(col="red") + stat_quantile(quantiles = c(0.25,0.75)) 

Unfortunately, I can't put method="loess" in stat_quantile(), which, if I'm not mistaken, would solve my problem.

(In case it's not clear, the desired behavior is nonlinear regression for the quantiles, so that the Q25 and Q75 curves lie below and above my red curve, respectively, and Q50, if plotted, would be my red curve.)

Thanks

Solution

With its default formula, stat_quantile fits a straight best-fit line for each requested quantile, here the 25th and 75th percentiles. stat_quantile uses the rq function from the quantreg package (implicitly, method="rq" in the stat_quantile call). As far as I know, rq doesn't do loess regression. However, you can use other flexible functions for the quantile regression. Here are two examples:

B-Spline:

library(splines)

stat_quantile(formula=y ~ bs(x, df=4), quantiles = c(0.25,0.75))

Second-Order Polynomial:

stat_quantile(formula=y ~ poly(x, 2), quantiles = c(0.25,0.75))

stat_quantile is still using rq in both cases, because rq accepts formulas like the ones above (if you don't supply a formula, stat_quantile implicitly uses formula=y~x). If you use the same formula in geom_smooth as in stat_quantile, together with method="lm" so that the formula is fitted by least squares rather than by the default smoother, you'll have consistent regression methods being used for the quantiles and for the mean expectation.
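
To make this concrete, here is a minimal sketch of a full plot that uses the same B-spline formula for the mean curve and for the quartile curves. The data frame is a hypothetical stand-in for mydf (simulated values, not the asker's data), and df = 4 is simply the value used above, not a recommendation.

```r
library(ggplot2)
library(quantreg)  # provides rq(), which stat_quantile() calls
library(splines)   # provides bs()

# Hypothetical stand-in for mydf: a curved trend plus noise
set.seed(1)
mydf <- data.frame(time = seq(0, 10, length.out = 200))
mydf$pdm <- sin(mydf$time) + rnorm(nrow(mydf), sd = 0.5)

p <- ggplot(mydf, aes(x = time, y = pdm)) +
  geom_point() +
  # Mean curve: least squares on the spline basis
  geom_smooth(method = "lm", formula = y ~ bs(x, df = 4),
              colour = "red", se = FALSE) +
  # Quartile curves: quantile regression on the same spline basis
  stat_quantile(formula = y ~ bs(x, df = 4), quantiles = c(0.25, 0.75))

print(p)

# The same quantile fit done directly with rq(), to inspect it numerically
fit <- rq(pdm ~ bs(time, df = 4), tau = c(0.25, 0.75), data = mydf)
q <- fitted(fit)  # matrix with one column per tau

# Share of points lying below the fitted 25% curve (should be near 0.25)
share_below_q25 <- mean(mydf$pdm < q[, 1])
```

Note that the red curve here is a mean fit, so it will only track the median curve closely when the noise is roughly symmetric. Adding 0.5 to quantiles draws the median curve for comparison.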
