How to add annotation over line plot to mark percent change in y-values between discrete x-values


Problem description

I want to visualize the results of a linear model in which the dependent variable changes as a function of discrete x-values. Since my x-values represent consecutive days, I want to annotate the change from each day to the next, in percent. How can I do this in a line plot?

library(tidyverse)
library(emmeans)

day_1 <- rnorm(1000, mean = 77, sd = 18)
day_2 <- rnorm(1000, mean = 74, sd = 19)
day_3 <- rnorm(1000, mean = 80, sd = 5)
day_4 <- rnorm(1000, mean = 76, sd = 18)


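# Stack the four day columns into long format: one row per observation
df <-
  cbind(day_1, day_2, day_3, day_4) %>%
  as_tibble() %>%  # as.tibble() is deprecated in favour of as_tibble()
  gather(key = day, value = mood, day_1:day_4) %>%
  mutate_at(vars(day), factor)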

> df

## # A tibble: 4,000 x 2
##   day    mood
##    <fct> <dbl>
##  1 day_1  83.9
##  2 day_1  94.9
##  3 day_1 104. 
##  4 day_1  81.0
##  5 day_1  61.4
##  6 day_1  95.1
##  7 day_1  78.6
##  8 day_1 108. 
##  9 day_1  74.7
## 10 day_1  79.7
## # ... with 3,990 more rows

Fitting and plotting

fit <-  lm(formula = mood ~ day, data = df)

emmip(fit, ~ day, CIs = TRUE)

  • Given that the plot object can be edited with ggplot functions, how can I add the change between days (in percent) to the plot?

Is there an efficient way to calculate the change and put it above each segment of the line?

Recommended answer

The following approach uses ggplot_build() (part of ggplot2 itself) to pull out the underlying data used to create your plot, then geom_label() to draw the annotations.

As indicated, we can use ggplot_build() to pull the data out of your plot.

p <- emmip(fit, ~ day, CIs = TRUE)  # save your plot as gg object
plotdata <- ggplot_build(p)$data[[1]]

There's quite a lot going on in that ggplot_build() call, so I'll explain. We want the data element of the result, which holds the datasets used to draw each of the layers. The plot has 3 layers: the points, the lines, and the bars for the CI. In principle you could pull any of them, but I'm choosing the first one ([[1]]). In particular, we want the y values.
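To make the structure of the ggplot_build() result more concrete, here is a minimal sketch on a toy plot. The data frame toy, its values, and the plot p_toy are made up for illustration and are not the emmip() plot above; the point is only to show that data is a list with one data frame per layer.

```r
library(ggplot2)

# Made-up data: one mean per day (hypothetical values, not from the model above)
toy <- data.frame(
  day  = factor(c("day_1", "day_2", "day_3", "day_4")),
  mood = c(77, 74, 80, 76)
)

p_toy <- ggplot(toy, aes(x = day, y = mood, group = 1)) +
  geom_point() +
  geom_line()

built <- ggplot_build(p_toy)

length(built$data)      # one data frame per layer; two layers in this toy plot
head(built$data[[1]])   # computed data for the first layer (the points)
built$data[[1]]$y       # the y values that were actually drawn
```

Note that the discrete x values come back as numeric positions (1, 2, 3, ...), which is why a numeric nudge_x can be applied to labels later on. ggplot2 also offers layer_data(p_toy, 1) as a shortcut for the same lookup.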

To calculate the percent change, I have written a small function that uses diff(). Since diff() returns one fewer element than its input (there is nothing to compare the first value with), we prepend a 0 for the first index. Then we add the result to plotdata as a new column:

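percent_change <- function(x) {
  # Divide each difference by the previous value. head(x, -1) drops the last
  # element; the original x[1:length(x)-1] only worked because R ignores the
  # index 0 produced by (1:length(x)) - 1.
  p_change <- (diff(x) / head(x, -1)) * 100
  return(c(0, p_change))  # add back the 0 for the first index
}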

plotdata$change <- percent_change(plotdata$y)
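To see what the function returns, here is a small worked example on made-up numbers. The vector means is hypothetical and chosen so the arithmetic is easy to follow; percent_change() is redefined here, in an equivalent form, only so that the snippet runs on its own.

```r
percent_change <- function(x) {
  # change relative to the previous element, with 0 for the first one
  c(0, diff(x) / head(x, -1) * 100)
}

# Hypothetical daily means
means <- c(100, 110, 99)

diff(means)                       # raw day-to-day differences: 10 and -11
head(means, -1)                   # the previous day's values: 100 and 110
round(percent_change(means), 2)   # 0 10 -10
```

Going from 100 to 110 is a rise of 10%, and going from 110 to 99 is a drop of 10%, each measured against the preceding day.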

Plotting

Now we're ready for the plot. We'll add a label geom to the plot, p. There are a few things going on in there:

  • The data is filtered to the rows of plotdata where plotdata$change != 0. This is because we don't want to label any point with no change (i.e. the first point, which has no previous day to compare with).

  • I need to add a "+" in front of positive values of plotdata$change. ifelse() within the label aesthetic seems to work just fine.

  • The color can be set dynamically here. You could also map it via aes(), but that would require creating another column, so it's simply convenient to use ifelse() to pick red or green, since there are only two options. You have to do this outside of aes(); otherwise the strings "red" and "green3" are treated as category labels, and you get a legend and the default ggplot2 colors instead. No legend is created the way I do it here.

Here's the code:

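# Keep only the points that have a change to report (drops the first day)
labeldata <- subset(plotdata, change != 0)

p + geom_label(
  data = labeldata,
  aes(x = x, y = y,
      # prefix positive changes with "+"; negative ones already carry "-"
      label = paste0(ifelse(change < 0, '', '+'), round(change, 2), '%')),
  # set outside aes() so the literal colors are used and no legend is drawn
  color = ifelse(labeldata$change < 0, 'red', 'green3'),
  nudge_x = -0.3
)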
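For comparison, the alternative mentioned above (mapping the color through aes() with an extra column) might look like the following sketch. It is not the approach used in this answer. The data frame labeldata_toy, its values, and the direction column are made up for illustration, and sprintf() stands in for the ifelse()/paste0() combination when building the signed label.

```r
library(ggplot2)

# Hypothetical stand-in for the label data (values are made up)
labeldata_toy <- data.frame(
  x = c(2, 3, 4),
  y = c(74, 80, 76),
  change = c(-3.9, 8.11, -5)
)

# The "+" flag in sprintf always prints the sign; "%%" is a literal percent sign
labeldata_toy$label <- sprintf("%+.2f%%", labeldata_toy$change)

# Extra column that the colour aesthetic can be mapped to
labeldata_toy$direction <- ifelse(labeldata_toy$change < 0, "down", "up")

q <- ggplot(labeldata_toy, aes(x = x, y = y)) +
  geom_line() +
  geom_label(aes(label = label, colour = direction), show.legend = FALSE) +
  scale_colour_manual(values = c(down = "red", up = "green3"))
```

Unlike round(change, 2), the sprintf() format always prints two decimals, so a change of -5 is shown as "-5.00%". The manual scale translates the direction categories into the intended colors, and show.legend = FALSE suppresses the legend that the mapping would otherwise create.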
