Making binned scatter plots for two variables in ggplot2 in R
I have a dataframe with two columns, x and y, that each contain values between 0 and 100 (the data are paired). I want to correlate them to each other using a binned scatter plot. If I were to use a regular scatter plot, it would be easy:
geom_point(aes(x=x, y=y))
But I'd like to instead bin the points into N bins from 0 to 100, compute the average value of x and the average value of y for the points in each bin, and show those as a scatter plot, so that the plot correlates the binned averages instead of the raw data points.
Is there a clever/quick way to do this in ggplot2, using some combination of geom_smooth() and geom_point()? Or does it have to be pre-computed manually and then plotted?
Yes, you can use stat_summary_bin.
set.seed(42)
x <- runif(1e4)
y <- x^2 + x + 4 * rnorm(1e4)
df <- data.frame(x=x, y=y)
library(ggplot2)
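# Raw points in the background, per-bin mean of y overlaid in orange.
# `fun` replaces the `fun.y` argument, which is deprecated since ggplot2 3.3.0.
(ggplot(df, aes(x=x, y=y)) +
  geom_point(alpha = 0.4) +
  stat_summary_bin(fun='mean', bins=20,
                   color='orange', size=2, geom='point'))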
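One caveat, as far as I can tell: stat_summary_bin summarises only y and places each point at its bin's position on the x axis, not at the mean of x within the bin. If you need the mean of x as well, as the question asks, pre-computing the averages is one option. The following is a minimal sketch using base R's cut() and aggregate(); the simulated data, the n_bins value and the bin and binned names are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)

set.seed(42)
# Simulated paired data on a 0-100 scale (assumption, for illustration only)
df <- data.frame(x = runif(1e4, 0, 100))
df$y <- df$x + rnorm(1e4, sd = 20)

n_bins <- 20  # hypothetical choice of N

# Assign each row to one of n_bins equal-width bins over [0, 100]
df$bin <- cut(df$x,
              breaks = seq(0, 100, length.out = n_bins + 1),
              include.lowest = TRUE)

# Mean of x and mean of y within each bin
binned <- aggregate(cbind(x, y) ~ bin, data = df, FUN = mean)

# Scatter plot of the binned averages only
ggplot(binned, aes(x = x, y = y)) +
  geom_point(color = "orange", size = 2)
```

Here binned has one row per non-empty bin, so the plot shows at most n_bins points; adding a geom_point layer with data = df would overlay the raw points as in the answer above.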