Secondary x-axis labels for sample size with ggplot2 on R


Question


I am trying to replicate the following graph, which I made in Excel, with ggplot2 in R:

I am able to successfully create the lines with geom_line(), the labels with geom_text() / some manual adjustments, and the y-axis. Find my code below:

library(readxl)
library(ggplot2)

somedata <- read_excel("somedata.xlsx")
somedata[,c(3:6)] <- somedata[,c(3:6)] * 100
somedata[,6] <- somedata[,6] + 25 

somegraph <- ggplot(data = somedata, aes(x = date))
somegraph + 
  geom_point(aes(y = eligible_main), shape = 15, size = 4) + 
  geom_line(aes(y = eligible_main)) + 
  geom_line(aes(y = eligible_center), size = 2) + 
  geom_line(aes(y = eligible_upper), linetype = 2) + 
  geom_line(aes(y = eligible_lower), linetype = 2) + 
  scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, 10), labels = paste0(seq(0, 100, 10), "%")) + 
  labs(title = "Title", x = "Time", y = "Percentage") + theme_classic() + 
  theme(plot.title = element_text(hjust = 0.5), axis.text.x = element_text(angle = 90, hjust = 1))

For reference, here is some fake data I created in a similar format as the data I am graphing. You can paste it into R and run my code above in the console: http://s000.tinyupload.com/?file_id=43876540434394267818

But I am finding it extremely difficult to create the sample-size secondary labels on the x-axis in the same orientation. Is there a simple solution to this in ggplot2, or with another package? Also, can I add annotation lines onto my graph to point things out once finished?

Thank you very much!

Solution

I have a solution that might help. I was unable to grab the data you shared, so I created my own dummy dataset as follows:

set.seed(12345)
library(lubridate)

df <- data.frame(
  dates=as.Date('2020-03-01')+days(0:9),
  y_vals=rnorm(10, 50,7),
  n=100
)

First, the basic plot:

library(scales)
library(ggrepel)    
p.basic <- ggplot(df, aes(dates, y_vals)) +
        geom_line() +
        geom_point(size=2.5, shape=15) +
        geom_text_repel(
            aes(label=paste0(round(y_vals, 1), '%')),
            size=3, direction='y', force=7) +
        ylim(0,100) +
        scale_x_date(breaks=date_breaks('day'), labels=date_format('%b %d')) +
        theme_bw()

Note that my code is a bit different than your own. Text labels are pushed away via the ggrepel package. I'm also using some functions from scales to fix and set formatting of the date axis (note also lubridate is the package used to create the dates in the dummy dataset above). Otherwise, pretty standard ggplot stuff there.

For the text outside the axis, the best way to do this is through a custom annotation, where you have to set up the grob (a grid graphical object) yourself. The approach here is as follows:

  • Move the axis "down" to allow room for the extra text. We do that via setting a margin on top of the axis title.

  • Turn off clipping via coord_cartesian(clip='off'). This is needed in order to see the annotations outside of the plot by allowing things to be drawn outside the plot area.

  • Loop through the values of df$n with a for loop, adding a separate annotation_custom object to the plot for each value.

Here's the code:

library(grid)  # provides textGrob() and gpar()

p <- p.basic +
    theme(axis.title.x = element_text(margin=margin(50,0,0,0))) +
    coord_cartesian(clip='off')

# add one rotated "n=..." label below the axis at each date
for (i in seq_along(df$n)) {
  p <- p + annotation_custom(
    textGrob(label=paste0('n=',df$n[i]), rot=90, gp=gpar(fontsize=9)),
    xmin=df$dates[i], xmax=df$dates[i], ymin=-25, ymax=-15
  )
}
p
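The same labels can also be built without growing the plot inside a loop, since ggplot2 accepts a list of layers on the right-hand side of +. The following is only a sketch of that variation, not part of the original answer: df2, n_labels and p2 are hypothetical names, the varying sample sizes are made up for illustration, and the base plot is a stripped-down stand-in for p.basic (no ggrepel labels, no date formatting).

```r
library(ggplot2)
library(grid)   # textGrob() and gpar() live here

# Hypothetical data: same shape as df above, but with made-up, varying sample sizes
set.seed(12345)
df2 <- data.frame(
  dates  = as.Date('2020-03-01') + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = seq(80, 125, by = 5)
)

# One annotation_custom() layer per row, collected in a list
n_labels <- lapply(seq_len(nrow(df2)), function(i) {
  annotation_custom(
    textGrob(label = paste0('n=', df2$n[i]), rot = 90, gp = gpar(fontsize = 9)),
    xmin = df2$dates[i], xmax = df2$dates[i], ymin = -25, ymax = -15
  )
})

p2 <- ggplot(df2, aes(dates, y_vals)) +
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100) +
  theme_bw() +
  theme(axis.title.x = element_text(margin = margin(50, 0, 0, 0))) +
  coord_cartesian(clip = 'off') +   # allow drawing below the panel
  n_labels                          # adding a list adds each layer in turn
```

Printing p2 should give the same kind of rotated "n=..." labels under each date; whether the loop or the list reads better is mostly a matter of taste.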

Advanced Options for more Fun

Two more things to add: Annotations (like callouts for specific points + text), and the lines below the plot in between the axis label stuff.

For lines below the axis: On most axes you can control where the tick marks go fairly easily via the breaks= parameter of the scale_... functions; however, for a date axis, it's... complicated. This is why we will just add the lines using the same method as above for the text below the axis. The idea is to break the axis into equal segments of width sub.div (in npc units, where the panel spans 0 to 1), based on how many discrete values are on your x axis. I could do this in-line a few times... but it's fun to create the variable first:

sub.div <- 1/length(df$n)

Then I use a for loop again to add each line individually at position sub.div*i:

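library(grid)  # provides linesGrob(); harmless if already loaded

# one vertical line at each interior division, drawn from the axis downward
for (i in seq_len(length(df$n)-1)) {
  p <- p + annotation_custom(
    linesGrob(
      x=unit(c(sub.div*i,sub.div*i), 'npc'),
      y=unit(c(0,-0.2), 'npc')   # length of the line below axis
    )
  )
}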

I realize I don't have the lines on the ends here, but you can probably see how it would be easy to add that by modifying the method above.
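One way to get the end lines as well is to let the index run from 0 to the number of points instead of from 1 to one less than that. This is a sketch under that assumption only: n_pts, tick_pos and end_lines are hypothetical names, and n_pts = 10 stands in for length(df$n).

```r
library(ggplot2)
library(grid)   # linesGrob()

n_pts   <- 10           # stand-in for length(df$n)
sub.div <- 1 / n_pts

# 0:n_pts instead of 1:(n_pts - 1) also yields the two end positions (0 and 1 npc)
tick_pos <- sub.div * (0:n_pts)

end_lines <- lapply(tick_pos, function(pos) {
  annotation_custom(
    linesGrob(
      x = unit(c(pos, pos), 'npc'),
      y = unit(c(0, -0.2), 'npc')   # same line length as above
    )
  )
})
```

The list would be used in place of the loop above (for example p + end_lines, on a plot that already has clip='off'), not in addition to it, otherwise the interior lines are drawn twice.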

Annotations (with arrows, why not?): There are lots of ways to do annotations, and the annotate() function is a common one. You can use annotate() if you wish, but in this example I'm just going to use geom_label for the text labels and geom_curve to make some curvy arrows.

You can manually pass individual aes() values in the call to both functions for each annotation, for example geom_text(aes(x=as.Date('2020-03-01'), y=55,.... But if you have several annotations, it is advisable to set them based on information within the dataframe itself. I'll do that here, labeling two of the points:

df$notes <- c('','','','Wow!','','','OMG!!!','','','')

You can use the value of df$notes to indicate which of the points are getting labeled, and also take advantage of the mapping of x and y values within the same dataset.

Then you just need to add the two geoms to your plot, modifying as you wish to fit your own aesthetics.

p <- p + geom_curve(
    data=df[which(df$notes!=''),],
    mapping=aes(x=dates+0.5, xend=dates, y=y_vals+20, yend=y_vals+2),
    color='red', curvature = 0.5,
    arrow=arrow(length=unit(5,'pt'))
  ) +
  geom_label(
    data=df[which(df$notes!=''),],
    aes(y=y_vals+20, label=notes),
    size=4, color='red', hjust=0
  )
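For a single callout, the annotate() route mentioned earlier might look roughly like the following. It is a sketch rather than the approach used in this answer: df3, pt and p3 are hypothetical names, the base plot is a stripped-down stand-in for p.basic, and the choice of the 4th point is arbitrary.

```r
library(ggplot2)   # also re-exports arrow() and unit() from grid

set.seed(12345)
df3 <- data.frame(
  dates  = as.Date('2020-03-01') + 0:9,
  y_vals = rnorm(10, 50, 7)
)

# Hypothetical choice: call out the 4th point only
pt <- df3[4, ]

p3 <- ggplot(df3, aes(dates, y_vals)) +
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100) +
  # curved arrow ending just above the point
  annotate(
    'curve',
    x = pt$dates + 0.5, xend = pt$dates,
    y = pt$y_vals + 20, yend = pt$y_vals + 2,
    color = 'red', curvature = 0.5,
    arrow = arrow(length = unit(5, 'pt'))
  ) +
  # text label at the start of the arrow
  annotate(
    'label',
    x = pt$dates, y = pt$y_vals + 20, label = 'Wow!',
    size = 4, color = 'red', hjust = 0
  )
```

With annotate() every position is passed by hand, which is fine for one or two callouts; the dataframe-driven geoms above scale better once there are several.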

Final thing: Horizontal Lines

One final thing that I noticed in your code before but forgot to point out: to make your horizontal lines, just use geom_hline. It's much easier. You can do it in two calls to geom_hline pretty easily (and even in just one call if you care to pass a dataframe to the function):

# note: on ggplot2 >= 3.4.0, line width is set with linewidth= (size= is deprecated for lines)
p <- p + geom_hline(yintercept = 50, size=2, color='gray30') +
  geom_hline(yintercept = c(25,75), linetype=2, color='gray30')

Just note that it's advisable to add these two geom_hline calls before geom_line or geom_point in the original p.basic plot so they are behind everything else.
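To illustrate both points together (a single geom_hline call fed by a dataframe, placed before the data layers), here is a minimal sketch. It assumes ggplot2 3.4.0 or later for the linewidth aesthetic and scale_linewidth_identity(); ref_lines, df4 and p4 are hypothetical names, and the base plot is a stripped-down stand-in for p.basic.

```r
library(ggplot2)

set.seed(12345)
df4 <- data.frame(
  dates  = as.Date('2020-03-01') + 0:9,
  y_vals = rnorm(10, 50, 7)
)

# Hypothetical helper table: one row per reference line
ref_lines <- data.frame(
  y  = c(25, 50, 75),
  lt = c('dashed', 'solid', 'dashed'),
  lw = c(0.5, 2, 0.5)
)

p4 <- ggplot(df4, aes(dates, y_vals)) +
  # reference lines first, so they are drawn behind the data
  geom_hline(
    data = ref_lines,
    aes(yintercept = y, linetype = lt, linewidth = lw),
    color = 'gray30'
  ) +
  scale_linetype_identity() +    # use the lt column values as-is
  scale_linewidth_identity() +   # use the lw column values as-is (ggplot2 >= 3.4.0)
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100) +
  theme_bw()
```

The identity scales tell ggplot2 to take the linetype and width values straight from the dataframe instead of mapping them to a legend.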
