What is the width argument in position_dodge?


Question

The documentation of position_dodge does not explain what exactly this width argument is:

  1. Whose width does it specify?
  2. What's the "unit"?
  3. What's the default value?

The default value is width = NULL, but trial and error shows that width = 0.9 seems to produce the default effect (see the postscript). However, I couldn't find where such a default value is set in the ggplot2 source code. Thus,

  4. Could you explain how the default dodge is implemented in the ggplot2 code?

The spirit of the question is to allow ggplot2 users to find appropriate width values without trial and error.

PS:

ggplot(data = df) +
  geom_bar(aes(x, y, fill = factor(group)), 
           position = position_dodge(), stat = "identity")

ggplot(data = df) +
  geom_bar(aes(x, y, fill = factor(group)), 
           position = position_dodge(0.9), stat = "identity")

Solution

I will first give very brief answers to your three main questions. Then I walk through several examples to illustrate the answers more thoroughly.

  1. Whose width does it specify?
    The width of the geom elements to be dodged.

  2. What's the "unit"?
    The actual or the virtual width in data units of the elements to be dodged.

  3. What's the default value?
    If you don't set the dodging width explicitly, but rely on the default value, position_dodge(width = NULL) (or just position = "dodge"), the dodge width which is used is the actual width in data units of the element to be dodged.

I believe your fourth question is too broad for SO. Please refer to the code of collide and dodge and, if needed, ask a new, more specific question.


Based on the dodge width of the element (together with its original horizontal position and the number of elements that are stacked), the new center position (x) of each element and its new width (xmin, xmax positions) are calculated. The elements are shifted horizontally just far enough not to overlap with adjacent elements. Obviously, wide elements need to be shifted more than narrow elements in order to avoid overlap.
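The calculation described above can be sketched in a few lines of base R. This is only a minimal illustration, under the assumption that the n elements at one x position share the dodge width equally. dodge_sketch() and its arguments are hypothetical names, not part of ggplot2, and the package's real code (see collide and dodge) handles more cases.

```r
# Simplified sketch of dodging; NOT ggplot2's actual implementation.
# dodge_sketch() is a hypothetical helper written for this illustration.
dodge_sketch <- function(x, width, n) {
  element_width <- width / n                       # each element gets an equal share
  i <- seq_len(n)                                  # group index 1..n
  xmin <- x - width / 2 + (i - 1) * element_width  # left edge of each share
  xmax <- xmin + element_width                     # right edge of each share
  data.frame(group = i, x = (xmin + xmax) / 2, xmin = xmin, xmax = xmax)
}

# Two elements stacked at x = 1, dodged as if their stacked width were 0.9
dodged <- dodge_sketch(x = 1, width = 0.9, n = 2)
print(dodged)
# centers 0.775 and 1.225; ranges 0.55 to 1.00 and 1.00 to 1.45
```

The two ranges touch at x = 1 without overlapping, which is the "just far enough" shift described above.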

To get a better feeling for dodging in general and the use of the width argument in particular, I show some examples. We start with a simple dodged bar plot with default dodging; we can use either position = "dodge" or the more explicit position = position_dodge(width = NULL):

library(ggplot2)

# some toy data
df <- data.frame(x = 1,
                 y = 1,
                 grp = c("A", "B"))

p <- ggplot(data = df, aes(x = x, y = y, fill = grp)) + theme_minimal()
p + geom_bar(stat = "identity",
             position = "dodge")
           # which is the same as:
           # position = position_dodge(width = NULL))

So (1) whose width is it in position_dodge and (2) what is the unit?

In ?position_dodge we can read:

width: Dodging width, when different to the width of the individual elements

Thus, if we use the default width, i.e. NULL, the dodging calculations are based on the width of the individual elements.

So a trivial answer to your first question, "Whose width does it specify?", would be: the width of the individual elements.

But of course we then wonder, what is "the width of the individual elements"? Let's start with the bars. From ?geom_bar:

width: Bar width. By default, set to 90% of the resolution of the data

A new question arises: what is resolution? Let's check ?ggplot2::resolution:

The resolution is the smallest non-zero distance between adjacent values. If there is only one unique value [like in our example], then the resolution is defined to be one.

We try:

resolution(df$x)
# [1] 1

Thus, the default bar width in this example is 0.9 * 1 = 0.9.
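The same arithmetic can be reproduced without ggplot2. The sketch below is a simplified stand-in for resolution() that follows only the quoted definition. resolution_sketch() and default_bar_width() are hypothetical helpers, and the real function has extra handling (for integer vectors, for example) that is left out here.

```r
# Hypothetical, simplified stand-in for ggplot2::resolution()
resolution_sketch <- function(x) {
  ux <- sort(unique(x))
  if (length(ux) < 2) return(1)  # only one unique value: resolution defined as 1
  min(diff(ux))                  # smallest non-zero distance between adjacent values
}

# Default bar width as described in ?geom_bar: 90% of the resolution
default_bar_width <- function(x) 0.9 * resolution_sketch(x)

width_single <- default_bar_width(c(1, 1))         # one unique x value
width_spaced <- default_bar_width(c(2, 4, 4, 10))  # smallest gap is 2
print(c(width_single, width_spaced))
# 0.9 1.8
```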

We may check this by looking at the data ggplot uses to render the bars on the plot using ggplot_build. We create a plot object with a stacked barplot, with bars of default width.

p2 <- p +
  geom_bar(stat = "identity",
           position = "stack")

The relevant slot in the object is $data, which is a list with one element for each layer in the plot, in the same order as they appear in the code. In this example, we only have one layer, i.e. geom_bar, so let's look at the first slot:

ggplot_build(p2)$data[[1]]

#      fill x y label PANEL group ymin ymax xmin xmax colour size linetype alpha
# 1 #F8766D 1 1     A     1     1    0    1 0.55 1.45     NA  0.5        1    NA
# 2 #00BFC4 1 2     B     1     2    1    2 0.55 1.45     NA  0.5        1    NA

Each row contains the data to 'draw' a single bar. As you can see, the width of each bar is 0.9 (xmax - xmin = 0.9). Thus, the width of the stacked bars, to be used in the calculations of the new dodged positions and widths, is 0.9.


In the previous example, we used the default bar width together with the default dodge width. Now let's make the bars slightly wider than the default width above (0.9). Use the width argument in geom_bar to explicitly set the (stacked) bar width to e.g. 1. We try to use the same dodge width as above (position_dodge(width = 0.9)). Thus, while we have set the actual bar width to 1, the dodge calculations are made as if the bars were 0.9 wide. Let's see what happens:

p +
  geom_bar(stat = "identity", width = 1, position = position_dodge(width = 0.9), alpha = 0.8)

The bars are overlapping because ggplot shifts bars horizontally as if they have a (stacked) width of 0.9 (set in position_dodge), while in fact the bars have a width of 1 (set in geom_bar).
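To make the overlap concrete, here is a rough base R sketch that keeps the dodge width and the actual bar width separate. It assumes that the centers are spread using the dodge width while each bar is drawn with an equal share of its actual width. That is my reading of how the dodge behaves, not a copy of ggplot2's code, and dodge_bars() is a hypothetical helper.

```r
# Hypothetical helper: dodge width and actual bar width are kept separate.
dodge_bars <- function(x, n, dodge_width, bar_width) {
  i <- seq_len(n)
  center <- x + dodge_width * ((i - 0.5) / n - 0.5)  # spread by the dodge width
  half <- bar_width / n / 2                          # drawn with the actual width
  data.frame(group = i, xmin = center - half, xmax = center + half)
}

# Actual width 1, but dodged as if the bars were 0.9 wide
mismatch <- dodge_bars(x = 1, n = 2, dodge_width = 0.9, bar_width = 1)
overlap_mismatch <- mismatch$xmax[1] - mismatch$xmin[2]  # positive: bars overlap

# Dodge width equal to the actual width
matched <- dodge_bars(x = 1, n = 2, dodge_width = 1, bar_width = 1)
overlap_matched <- matched$xmax[1] - matched$xmin[2]     # zero: bars just touch

print(c(overlap_mismatch, overlap_matched))
```

Under these assumptions the mismatched case overlaps by 0.05 data units, while the matched case leaves the bars exactly adjacent.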

If we use the default dodge values, the bars are shifted horizontally accurately according to the set bar width:

p +
  geom_bar(stat = "identity", width = 1, position = "dodge", alpha = 0.8)
                                   # or: position = position_dodge(width = NULL)


Next we try to add some text to our plot using geom_text. We start with the default dodging width (position_dodge(width = NULL)), so that dodging is based on the default element size.

p <- ggplot(data = df, aes(x = x, y = y, fill = grp, label = grp)) + theme_minimal()
p2 <- p +
  geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
  geom_text(size = 10, position = position_dodge(width = NULL))
                  # or position = "dodge"    

p2
# Warning message:
#  Width not defined. Set with `position_dodge(width = ?)`

The dodging of the text fails. What about the warning message, "Width not defined"? It is slightly cryptic. We need to consult the Details section of ?geom_text:

Note that the "width" and "height" of a text element are 0, so stacking and dodging text will not work by default, [...] Obviously, labels do have height and width, but they are physical units, not data units.

So for geom_text, the width of the individual elements is zero. This is also the first 'official ggplot reference' that answers your second question: width is measured in data units.

Let's look at the data used to render the text elements on the plot:

ggplot_build(p2)$data[[2]]
#      fill x y label PANEL group xmin xmax ymax colour size angle hjust vjust alpha family fontface lineheight
# 1 #F8766D 1 1     A     1     1    1    1    1  black   10     0   0.5   0.5    NA               1        1.2
# 2 #00BFC4 1 1     B     1     2    1    1    1  black   10     0   0.5   0.5    NA               1        1.2

Indeed, xmin == xmax; thus, the width of the text element in data units is zero.

How to achieve correct dodging of the text element with width zero? From Examples in ?geom_text:

ggplot2 doesn't know you want to give the labels the same virtual width as the bars [...] So tell it:

Thus, in order for dodge to use the same width for geom_text elements as for the geom_bar elements when new positions are calculated, we need to set "the virtual dodging width in data units" of the text element to the same width as the bars. We use the width argument of position_dodge to set the virtual width of the text element to 0.9 (i.e. the bar width in the example above):

p2 <- p +
  geom_bar(stat = "identity", position = position_dodge(width = NULL)) +
  geom_text(position = position_dodge(width = 0.9), size = 10)

Check the data used for rendering geom_text:

ggplot_build(p2)$data[[2]]
#      fill     x y label PANEL group xmin xmax ymax colour size angle hjust vjust alpha family fontface lineheight
# 1 #F8766D 0.775 1     A     1     1 0.55 1.00    1  black   10     0   0.5   0.5    NA               1        1.2
# 2 #00BFC4 1.225 1     B     1     2 1.00 1.45    1  black   10     0   0.5   0.5    NA               1        1.2

Now the text elements have a width in data units: xmax - xmin = 0.9, i.e. the same width as the bars. Thus, the dodge calculations will now be made as if the text elements have a certain width, here 0.9. Render the plot:

p2

The text is dodged correctly!


Similar to text, the width in data units of points (geom_point) and error bars (e.g. geom_errorbar) is zero. Thus, if you need to dodge such elements, you need to specify a relevant virtual width, on which dodge calculations then are based. See e.g. the Example section of ?geom_errorbar:

If you want to dodge bars and errorbars, you need to manually specify the dodge width [...] Because the bars and errorbars have different widths we need to specify how wide the objects we are dodging are


Here is an example with several x values on a continuous scale:

df <- data.frame(x = rep(c(10, 20, 50), each = 2),
                 y = 1,
                 grp = c("A", "B"))

Let's say we wish to create a dodged barplot with some text above each bar. First, just check a barplot only using the default dodging width:

p <- ggplot(data = df, aes(x = x, y = y, fill = grp, label = grp)) + theme_minimal()

p + 
  geom_bar(stat = "identity", position = position_dodge(width = NULL))
                         # or position = "dodge"

It works as expected. Then, add the text. We try to set the virtual width of the text element to the same as the width of the bars in the example above, i.e. we "guess" that the bars still have width of 0.9, and that we need to dodge the text elements as if they have a width of 0.9 as well:

p +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(position = position_dodge(width = 0.9), size = 10)

Clearly, the dodging calculation for the bars is now based on a different width than 0.9 and setting the virtual width to 0.9 for the text element was a bad guess. So what is bar width here? Again, bar width is "[b]y default, set to 90% of the resolution of the data". Check the resolution:

resolution(df$x)
# [1] 10

Thus, the width of the (default stacked) bars, on which their new, dodged position is calculated, is now 0.9 * 10 = 9. To dodge the bars and their corresponding text 'hand in hand', we need to set the virtual width of the text elements to 9 as well:

p +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(position = position_dodge(width = 9), size = 10)


In our final example, we have a categorical x axis, just a 'factor version' of the x values from above.

df <- data.frame(x = factor(rep(c(10, 20, 50), each = 2)),
                 y = 1,
                 grp = c("A", "B"))

In R, factors are internally a set of integer codes with a "levels" attribute. And from ?resolution:

If x is an integer vector, then it is assumed to represent a discrete variable, and the resolution is 1.
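A quick base R check of that claim: the integer codes behind the factor are spaced one unit apart, whatever the original numeric values were. The variable names below are illustrative only.

```r
# The factor from the example: levels "10", "20", "50"
x <- factor(rep(c(10, 20, 50), each = 2))

codes <- as.integer(x)              # internal integer codes: 1 1 2 2 3 3
gaps <- diff(sort(unique(codes)))   # distances between adjacent codes: 1 1

# 90% of the smallest gap, i.e. the dodge width to reuse for the text
text_dodge_width <- 0.9 * min(gaps)
print(text_dodge_width)
# 0.9
```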

By now, we know that when resolution is 1, the default width of the bars is 0.9. Thus, on a categorical x axis, the default width for geom_bar is 0.9, and we need to set the dodging width for geom_text accordingly:

ggplot(data = df, aes(x = x, y = y, fill = grp, label = grp)) +
  theme_minimal() +
  geom_bar(stat = "identity", position = "dodge") +
  # or: position = position_dodge(width = NULL)
  # or: position = position_dodge(width = 0.9)
  geom_text(position = position_dodge(width = 0.9), size = 10)
