geom_dotplot dot sizes change when plotting different datasets in a loop


Question

I'm trying to make many dot plots for different subsets of a dataset. The problem is that the formatting is not consistent across plots. In particular, the size of the dots differs from plot to plot.

The range of the "y" variable is not the same across subsets. Is this the reason?

rm(list=ls()) 
library(ggplot2)

outdir<-"SELECT YOUR OUTPUT DIRECTORY"

#generate subsets separately
set.seed(1)
#
data1<-rbind(
  data.frame(poll=rnorm(20,20,5),zone="zone1"),
  data.frame(poll=rnorm(20,16,1),zone="zone2"))
data1$id="ID1"

data2<-rbind(
  data.frame(poll=rnorm(20,2,3),zone="zone1"),
  data.frame(poll=rnorm(20,2,1),zone="zone2"))
data2$id="ID2"

#this is the sample full data set
alldata<-rbind(data1,data2)

ids<-unique(alldata$id)

for (i in ids) {
  graphdata<-subset(alldata, id==i)

  p<-ggplot(graphdata, aes(x=zone, y=poll)) + 
    geom_dotplot(binaxis='y', stackdir='center', binwidth=0.8, 
                 method="histodot",stackratio=0.8, dotsize=0.5) +
    ggtitle(i)

  # Build the output path and save this iteration's plot object explicitly
  fname<-file.path(outdir,paste0(i,".png"))
  ggsave(fname,plot=p)
}

Recommended answer

While geom_dotplot looks like a dot plot, it's actually a different representation of a histogram. If we look at ?geom_dotplot, we see that the size of the dots is not an absolute size, but is based on the width of the bins relative to the x-axis or y-axis (as appropriate):

In a dot plot, the width of a dot corresponds to the bin width ...

And the dotsize argument (contrary to what you might expect) just scales the size of the dots by a relative factor:

dotsize: The diameter of the dots relative to binwidth, default 1.

We can see this with an example:

ggplot(mtcars, aes(x = mpg)) +
  geom_dotplot(binwidth = 1.5, stackdir = "center")

By scaling the x-axis by three while keeping binwidth constant, we shrink the bins relative to the axis, and the dots shrink with them:

ggplot(mtcars, aes(x = mpg*3)) +
  geom_dotplot(binwidth = 1.5, stackdir = "center")

If we also multiply binwidth by three, the bins keep the same size relative to the axis, and the dots are the same size as in the first example:

ggplot(mtcars, aes(x = mpg*3)) +
  geom_dotplot(binwidth = 4.5, stackdir = "center")

We can also compensate by setting dotsize = 3 (up from its default value of 1). This makes the dots 3x larger so they match the size of the dots in the first example, despite the bins being smaller relative to the axis. Note that they overlap now, since the dots are larger than the space they take up on the x-axis:

ggplot(mtcars, aes(x = mpg*3)) +
  geom_dotplot(binwidth = 1.5, stackdir = "center", dotsize = 3)
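The four mtcars examples above can be checked with a little arithmetic. This is a rough sketch, not ggplot2's actual drawing code: the helper rel_dot_diameter is a hypothetical name introduced here, and it assumes the on-screen dot diameter is proportional to binwidth * dotsize divided by the span of the binned axis, ignoring the small padding ggplot2 adds at each end of the axis.

```r
# Hypothetical helper (not part of ggplot2): approximate fraction of the
# binned axis covered by one dot, ignoring axis padding.
rel_dot_diameter <- function(x, binwidth, dotsize = 1) {
  binwidth * dotsize / diff(range(x))
}

mpg <- mtcars$mpg

base_case   <- rel_dot_diameter(mpg,     binwidth = 1.5)               # first example
scaled_axis <- rel_dot_diameter(mpg * 3, binwidth = 1.5)               # axis stretched 3x
wider_bins  <- rel_dot_diameter(mpg * 3, binwidth = 4.5)               # binwidth also 3x
bigger_dots <- rel_dot_diameter(mpg * 3, binwidth = 1.5, dotsize = 3)  # dotsize 3x

print(c(base = base_case, scaled = scaled_axis,
        wider = wider_bins, bigger = bigger_dots))
```

Under that assumption, the second case comes out at one third of the first, while the third and fourth cases match the first, which is the pattern the plots show.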

If you want your dots to be the same size, I'd use a dynamic value for dotsize to scale them. There's probably a more elegant way to do this, but as a simple attempt, I'd calculate the largest y-axis range across all your datasets:

# Put this outside the loop
#   and choose whatever dataset has the largest range
max_y_range <- max(data1$poll) - min(data1$poll)

Then, in your loop, set:

dotsize = (max(graphdata$poll) - min(graphdata$poll))/max_y_range

This should scale your dots properly as the y-axis changes between plots.
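Putting the pieces together, here is one way the loop from the question might look with the dynamic dotsize applied. It is a sketch under a few assumptions: the largest range is computed from the data rather than picked by hand, the names y_ranges and plots are introduced here for illustration, and the plots are collected in a list instead of being written to disk. The fixed dotsize = 0.5 from the question is replaced by the ratio, as the answer suggests.

```r
library(ggplot2)

# Same sample data as in the question
set.seed(1)
alldata <- rbind(
  data.frame(poll = rnorm(20, 20, 5), zone = "zone1", id = "ID1"),
  data.frame(poll = rnorm(20, 16, 1), zone = "zone2", id = "ID1"),
  data.frame(poll = rnorm(20, 2, 3),  zone = "zone1", id = "ID2"),
  data.frame(poll = rnorm(20, 2, 1),  zone = "zone2", id = "ID2")
)

# Range of the y variable within each id, and the largest of those ranges
y_ranges <- vapply(split(alldata$poll, alldata$id),
                   function(v) diff(range(v)), numeric(1))
max_y_range <- max(y_ranges)

plots <- list()
for (i in names(y_ranges)) {
  graphdata <- subset(alldata, id == i)

  # dotsize is 1 for the subset with the widest range and smaller otherwise
  plots[[i]] <- ggplot(graphdata, aes(x = zone, y = poll)) +
    geom_dotplot(binaxis = "y", stackdir = "center", binwidth = 0.8,
                 method = "histodot", stackratio = 0.8,
                 dotsize = y_ranges[[i]] / max_y_range) +
    ggtitle(i)
}
```

Each element of plots can then be passed to ggsave(fname, plot = plots[[i]]). If the dots come out too large overall, multiplying the ratio by a constant factor (for example the original 0.5) should shrink them uniformly.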
