R Scatter Plot: symbol color represents number of overlapping points


Question

Scatter plots can be hard to interpret when many points overlap, because the overlap obscures the density of data in a particular region. One solution is to use semi-transparent colors for the plotted points, so that opaque regions indicate that many observations are present at those coordinates.

Below is an example of my black and white solution in R:

MyGray <- rgb(t(col2rgb("black")), alpha=50, maxColorValue=255)
x1 <- rnorm(n=1E3, sd=2)
x2 <- x1*1.2 + rnorm(n=1E3, sd=2)
dev.new(width=3.5, height=5)
par(mfrow=c(2,1), mar=c(2.5,2.5,0.5,0.5), ps=10, cex=1.15)
plot(x1, x2, ylab="", xlab="", pch=20, col=MyGray)
plot(x1, x2, ylab="", xlab="", pch=20, col="black")
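As a side note on the translucent color used above: base R's grDevices also provides adjustcolor(), which can produce the same color with less typing. The following is only a small sketch; the variable names gray_rgb and gray_adjust are illustrative and not part of the original code.

```r
# Two ways to build a translucent black with alpha = 50 out of 255.
# gray_rgb mirrors the MyGray definition above; gray_adjust uses adjustcolor().
gray_rgb    <- rgb(0, 0, 0, alpha = 50, maxColorValue = 255)
gray_adjust <- adjustcolor("black", alpha.f = 50 / 255)

# Both are "#RRGGBBAA" hex strings and can be passed to col= in plot().
print(c(gray_rgb, gray_adjust))
```

Either value can replace MyGray in the plot() call above.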

However, I recently came across this article in PNAS, which took a similar approach but used heat-map coloration, rather than opacity, to indicate how many points were overlapping. The article is Open Access, so anyone can download the .pdf and look at Figure 1, which contains a relevant example of the graph I want to create. The methods section of the paper indicates that the analyses were done in Matlab.

For the sake of convenience, here is a small portion of Figure 1 from the above article:

How would I create a scatter plot in R that uses color, not opacity, as an indicator of point density?

For starters, R users can access this Matlab color scheme through the function tim.colors() in the fields package (install.packages("fields")).

Is there an easy way to make a figure similar to Figure 1 of the above article, but in R? Thanks!

Recommended answer

One option is to use densCols() to extract kernel densities at each point. Mapping those densities to the desired color ramp and plotting the points in order of increasing local density gives a plot much like those in the linked article.

## Data in a data.frame
x1 <- rnorm(n=1E3, sd=2)
x2 <- x1*1.2 + rnorm(n=1E3, sd=2)
df <- data.frame(x1,x2)

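## Use densCols() output to get density at each point.
## With a black-to-white ramp, the red channel (0-255) of each returned
## color acts as a proxy for local density; adding 1 turns it into an
## index (1-256) into a 256-color palette.
x <- densCols(x1,x2, colramp=colorRampPalette(c("black", "white")))
df$dens <- col2rgb(x)[1,] + 1L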

## Map densities to colors
cols <-  colorRampPalette(c("#000099", "#00FEFF", "#45FE4F", 
                            "#FCFF00", "#FF9400", "#FF3100"))(256)
df$col <- cols[df$dens]

## Plot it, reordering rows so that densest points are plotted on top
plot(x2~x1, data=df[order(df$dens),], pch=20, col=col, cex=2)
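The same idea can also be sketched with an explicit density estimate instead of reading densities back out of densCols() colors. The version below is an alternative, not the answer's method: it assumes the MASS package (normally shipped with R) and uses kde2d() on a 100 x 100 grid, then looks up each point's density at the grid node at or just below it, which is only an approximation. The names dens_grid, point_dens, bin, and point_cols are illustrative.

```r
library(MASS)  # provides kde2d()

set.seed(1)
x1 <- rnorm(n = 1e3, sd = 2)
x2 <- x1 * 1.2 + rnorm(n = 1e3, sd = 2)

## 2D kernel density estimate on a regular grid spanning the data range
dens_grid <- kde2d(x1, x2, n = 100)

## Approximate each point's density by the grid node at or just below it
ix <- findInterval(x1, dens_grid$x)
iy <- findInterval(x2, dens_grid$y)
point_dens <- dens_grid$z[cbind(ix, iy)]

## Bin densities into 256 levels and map them to the same color ramp
cols <- colorRampPalette(c("#000099", "#00FEFF", "#45FE4F",
                           "#FCFF00", "#FF9400", "#FF3100"))(256)
bin <- cut(point_dens, breaks = 256, labels = FALSE)
point_cols <- cols[bin]

## Plot low-density points first so the densest points end up on top
ord <- order(point_dens)
plot(x1[ord], x2[ord], pch = 20, col = point_cols[ord], cex = 2,
     xlab = "x1", ylab = "x2")
```

Compared with the densCols() approach, this keeps the density values themselves available (for a legend, for example), at the cost of a coarser per-point lookup.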
