geom_smooth on a subset of data


Question

Here is some data and a plot:

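library(ggplot2)
set.seed(18)
data <- data.frame(y = c(rep(0:1, 3), rnorm(18, mean = 0.5, sd = 0.1)), colour = rep(1:2, 12), x = rep(1:4, each = 6))

# one linear fit per colour group, computed on all rows
ggplot(data, aes(x = x, y = y, colour = factor(colour))) + geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)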

As you can see, the linear regressions are strongly influenced by the values at x = 1. Can I have the regressions calculated only for x >= 2 while still displaying the points at x = 1 (where y is either 0 or 1)? The resulting graph would be exactly the same except for the regression lines, which would no longer "suffer" from the influence of the values at x = 1.

Recommended answer

It's as simple as geom_smooth(data = subset(data, x >= 2), ...). That's fine if the plot is just for yourself, but be aware that something like this would be misleading to others unless you mention how the regression was performed. I'd recommend changing the transparency of the excluded points.

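ggplot(data, aes(x = x, y = y, colour = factor(colour))) +
  # points used in the fit at full opacity, excluded points faded
  geom_point(data = subset(data, x >= 2)) +
  geom_point(data = subset(data, x < 2), alpha = 0.2) +
  # regression computed only on rows with x >= 2
  geom_smooth(data = subset(data, x >= 2), method = "lm", formula = y ~ x, se = FALSE)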
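To see numerically what the subset changes, here is a minimal sketch that fits the same per-colour models directly with lm(), once on all rows and once on rows with x >= 2. The helper fit_by_colour and the names coef_all and coef_subset are made up for this illustration and are not part of the original answer. It assumes geom_smooth(method = "lm") fits one y ~ x model per colour group, which is how the plots above are grouped.

```r
# Same data as in the question
set.seed(18)
data <- data.frame(
  y = c(rep(0:1, 3), rnorm(18, mean = 0.5, sd = 0.1)),
  colour = rep(1:2, 12),
  x = rep(1:4, each = 6)
)

# Hypothetical helper: fit y ~ x separately for each colour group
# and return the coefficients as a matrix (one column per group)
fit_by_colour <- function(d) {
  sapply(split(d, d$colour), function(g) coef(lm(y ~ x, data = g)))
}

coef_all    <- fit_by_colour(data)                  # all rows, including x = 1
coef_subset <- fit_by_colour(subset(data, x >= 2))  # rows with x >= 2 only

print(coef_all)
print(coef_subset)
```

Comparing the "x" row of the two matrices shows how much the slopes are driven by the points at x = 1.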
