How to reproduce smoothScatter's outlier plotting in ggplot?


Question

I am trying to get something like what the smoothScatter function does, only in ggplot. I have figured out everything except for plotting the N most sparse points. Can anyone help me with this?

library(grDevices)
library(ggplot2)

# Make two new devices
dev.new()
dev1 <- dev.cur()
dev.new()
dev2 <- dev.cur()

# Make some data that needs to be plotted on log scales
mydata <- data.frame(x=exp(rnorm(10000)), y=exp(rnorm(10000)))

# Plot the smoothScatter version
dev.set(dev1)
with(mydata, smoothScatter(log10(y)~log10(x)))

# Plot the ggplot version
dev.set(dev2)
ggplot(mydata) + aes(x=x, y=y) + scale_x_log10() + scale_y_log10() + 
  stat_density2d(geom="tile", aes(fill=..density..^0.25), contour=FALSE) +
  scale_fill_gradientn(colours = colorRampPalette(c("white", blues9))(256))

Notice how in the base graphics version, the 100 most "sparse" points are plotted over the smoothed density plot. Sparseness is defined by the value of the kernel density estimate at the point's coordinate, and importantly, the kernel density estimate is calculated after the log transform (or whatever other coordinate transform). I can plot all points by adding + geom_point(size=0.5), but I only want the sparse ones.

Is there any way to accomplish this with ggplot? There are really two parts to this. The first is to figure out what the outliers are after coordinate transforms, and the second is to plot only those points.

Solution

Here is a workaround of sorts. It doesn't select the n least dense points; instead it plots all points where density^0.25 is below a chosen threshold.

It actually draws the stat_density2d() layer, then the geom_point() layer, then a second stat_density2d() layer on top, using alpha to make that last layer transparent wherever density^0.25 is below (in this case) 0.4. The opaque part of the top layer hides the points in the dense middle, so points remain visible only in the sparse regions.

Obviously you take the performance hit of drawing three layers.

# Plot the ggplot version
ggplot(mydata) + aes(x=x, y=y) + scale_x_log10() + scale_y_log10() + 
  stat_density2d(geom="tile", aes(fill=..density..^0.25, alpha=1), contour=FALSE) + 
  geom_point(size=0.5) +
  stat_density2d(geom="tile", aes(fill=..density..^0.25, alpha=ifelse(..density..^0.25<0.4,0,1)), contour=FALSE) +
  scale_fill_gradientn(colours = colorRampPalette(c("white", blues9))(256))
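
For the "N sparsest points" part of the question, one possible approach is to estimate the density yourself on the log10 scale, rank the points by it, and pass only the lowest-ranked ones to geom_point(). The sketch below is not part of the original answer. It assumes MASS::kde2d() as the estimator (smoothScatter uses a different estimator and bandwidth, so the selected points will not match it exactly), assumes ggplot2 >= 3.3.0 for after_stat(), and introduces the names lx, ly, kde, dens, n_sparse and sparse purely for illustration. The density lookup uses the grid node just below each point, which is a coarse approximation.

```r
library(ggplot2)
library(MASS)  # provides kde2d()

set.seed(1)
mydata <- data.frame(x = exp(rnorm(10000)), y = exp(rnorm(10000)))

# Estimate the 2D kernel density after the log10 transform.
lx <- log10(mydata$x)
ly <- log10(mydata$y)
kde <- kde2d(lx, ly, n = 128)

# For each point, take the density at the grid node just below it
# in x and y (a coarse lookup, not an interpolation).
ix <- findInterval(lx, kde$x, all.inside = TRUE)
iy <- findInterval(ly, kde$y, all.inside = TRUE)
mydata$dens <- kde$z[cbind(ix, iy)]

# Keep the n_sparse points with the lowest estimated density.
n_sparse <- 100
sparse <- mydata[order(mydata$dens)[seq_len(n_sparse)], ]

# Same density layer as before, but only the sparse points are drawn.
p <- ggplot(mydata, aes(x = x, y = y)) +
  scale_x_log10() + scale_y_log10() +
  stat_density2d(geom = "tile", aes(fill = after_stat(density)^0.25),
                 contour = FALSE) +
  geom_point(data = sparse, size = 0.5) +
  scale_fill_gradientn(colours = colorRampPalette(c("white", blues9))(256))
print(p)
```

This draws two layers instead of three and gives direct control over the number of points through n_sparse. A finer grid (larger n in kde2d()) makes the lookup more accurate at the cost of more computation.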
