Secondary x-axis labels for sample size with ggplot2 on R


Question


I am trying to replicate the following graph, which I made in Excel, with ggplot2 in R:

I am able to successfully create the lines with geom_line(), the labels with geom_text() / some manual adjustments, and the y-axis. Find my code below:

library(readxl)
library(ggplot2)

somedata <- read_excel("somedata.xlsx")
somedata[,c(3:6)] <- somedata[,c(3:6)] * 100
somedata[,6] <- somedata[,6] + 25 

somegraph <- ggplot(data = somedata, aes(x = date))
somegraph + 
  geom_point(aes(y = eligible_main), shape = 15, size = 4) + 
  geom_line(aes(y = eligible_main)) + 
  geom_line(aes(y = eligible_center), size = 2) + 
  geom_line(aes(y = eligible_upper), linetype = 2) + 
  geom_line(aes(y = eligible_lower), linetype = 2) + 
  scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, 10), labels = paste0(seq(0, 100, 10), "%")) + 
  labs(title = "Title", x = "Time", y = "Percentage") + theme_classic() + 
  theme(plot.title = element_text(hjust = 0.5), axis.text.x = element_text(angle = 90, hjust = 1))

For reference, here is some fake data I created in a similar format as the data I am graphing. You can paste it into R and run my code above in the console: http://s000.tinyupload.com/?file_id=43876540434394267818

But I am finding it extremely difficult to create the sample-size secondary labels on the x-axis in the same orientation. Is there a simple solution to this in ggplot2, or with another package? Also, can I add annotation lines onto my graph to point things out once finished?

Thank you very much!

Solution

I have a solution that might help. I was unable to grab the data you shared, so I created my own dummy dataset as follows:

set.seed(12345)
library(lubridate)

df <- data.frame(
  dates=as.Date('2020-03-01')+days(0:9),
  y_vals=rnorm(10, 50,7),
  n=100
)

First, the basic plot:

library(scales)
library(ggrepel)    
p.basic <- ggplot(df, aes(dates, y_vals)) +
        geom_line() +
        geom_point(size=2.5, shape=15) +
        geom_text_repel(
            aes(label=paste0(round(y_vals, 1), '%')),
            size=3, direction='y', force=7) +
        ylim(0,100) +
        scale_x_date(breaks=date_breaks('day'), labels=date_format('%b %d')) +
        theme_bw()

Note that my code is a bit different from your own. Text labels are pushed apart by the ggrepel package. I'm also using some functions from scales to set the breaks and formatting of the date axis (and lubridate is the package used to create the dates in the dummy dataset above). Otherwise, it's pretty standard ggplot stuff.
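If you would rather not load scales and lubridate just for this, the same daily breaks and labels can be sketched with base R date arithmetic and the date_breaks= and date_labels= arguments of scale_x_date(). The object name p.dates is only illustrative, and this version leaves out the ggrepel labels:

```r
library(ggplot2)

# Same dummy data as above, built with base R date arithmetic
set.seed(12345)
df <- data.frame(
  dates  = as.Date("2020-03-01") + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = 100
)

# One break per day, labelled like "Mar 01"
p.dates <- ggplot(df, aes(dates, y_vals)) +
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100) +
  scale_x_date(date_breaks = "1 day", date_labels = "%b %d") +
  theme_bw()
```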

For the text outside the axis, the best way I know of is a custom annotation, for which you have to set up a grob (here with textGrob() from the grid package). The approach is as follows:

  • Move the axis "down" to allow room for the extra text. We do that via setting a margin on top of the axis title.

  • Turn off clipping via coord_cartesian(clip='off'). This is needed in order to see the annotations outside of the plot by allowing things to be drawn outside the plot area.

  • Loop through the values of df$n with a for loop, adding a separate annotation_custom object to the plot for each value.

Here's the code:

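library(grid)  # provides textGrob() and gpar()

p <- p.basic +
    # extra top margin on the axis title makes room for the "n=" labels
    theme(axis.title.x = element_text(margin=margin(50,0,0,0))) +
    # allow drawing outside the panel
    coord_cartesian(clip='off')

# one rotated text grob per date, placed below the axis (y from -25 to -15)
for (i in seq_along(df$n)) {
  p <- p + annotation_custom(
    textGrob(
      label=paste0('n=',df$n[i]), rot=90, gp=gpar(fontsize=9)),
    xmin=df$dates[i], xmax=df$dates[i], ymin=-25, ymax=-15
  )
}
p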
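As a variation on the loop, here is a minimal sketch that builds all the annotations as a list with lapply() and adds the list to the plot in one step, which ggplot2 allows. It rebuilds a simplified base plot so that it runs on its own; p.base, n_labels and p.labelled are illustrative names, not part of the code above.

```r
library(ggplot2)
library(grid)   # textGrob(), gpar()

set.seed(12345)
df <- data.frame(
  dates  = as.Date("2020-03-01") + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = 100
)

# Simplified base plot with room below the axis and clipping turned off
p.base <- ggplot(df, aes(dates, y_vals)) +
  geom_line() +
  ylim(0, 100) +
  theme(axis.title.x = element_text(margin = margin(50, 0, 0, 0))) +
  coord_cartesian(clip = "off")

# One annotation layer per row of df
n_labels <- lapply(seq_len(nrow(df)), function(i) {
  annotation_custom(
    textGrob(label = paste0("n=", df$n[i]), rot = 90, gp = gpar(fontsize = 9)),
    xmin = df$dates[i], xmax = df$dates[i], ymin = -25, ymax = -15
  )
})

# Adding a list of layers to a plot adds each element in turn
p.labelled <- p.base + n_labels
```

The result should match the for loop version; the list form just keeps the annotations in one object that can be reused on another plot.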

Advanced Options for more Fun

Two more things to add: annotations (like callouts for specific points, with text), and the lines below the plot between the axis labels.

For lines below the axis: On other axes you can add extra breaks fairly easily via the breaks= parameter of the scale_... functions; however, for a date axis, it's... complicated. This is why we will just add the lines with the same method used above for the text below the axis. The idea is to split the axis into segments of width sub.div (in the code below), based on how many discrete values are on your x axis. I could do this in-line a few times... but it's fun to create the variable first:

sub.div <- 1/length(df$n)

Then I use that in another for loop, adding one line as its own annotation at each position sub.div*i:

for (i in 1:(length(df$n)-1)) {
  p <- p + annotation_custom(
    linesGrob(
      x=unit(c(sub.div*i,sub.div*i), 'npc'),
      y=unit(c(0,-0.2), 'npc')   # length of the line below axis
    )
  )
}

I realize I don't have the lines on the ends here, but you can probably see how it would be easy to add that by modifying the method above.
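One way to add the end lines, as a sketch: run the positions from 0 to 1 in npc units instead of stopping short at both ends. The base plot is rebuilt here so the block runs on its own; tick_pos and p.ticks are illustrative names.

```r
library(ggplot2)
library(grid)   # linesGrob(), unit()

set.seed(12345)
df <- data.frame(
  dates  = as.Date("2020-03-01") + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = 100
)

sub.div  <- 1 / length(df$n)
# 0, 0.1, ..., 1: includes both ends of the panel
tick_pos <- sub.div * 0:length(df$n)

p.ticks <- ggplot(df, aes(dates, y_vals)) +
  geom_line() +
  ylim(0, 100) +
  theme(axis.title.x = element_text(margin = margin(50, 0, 0, 0))) +
  coord_cartesian(clip = "off")

# One vertical line per position, hanging below the axis
for (pos in tick_pos) {
  p.ticks <- p.ticks + annotation_custom(
    linesGrob(
      x = unit(c(pos, pos), "npc"),
      y = unit(c(0, -0.2), "npc")   # length of the line below the axis
    )
  )
}
```

With 10 values this gives 11 lines, two more than the loop above.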

Annotations (with arrows, why not?): There are lots of ways to do annotations, and the annotate() function covers many of them. You can use annotate() if you wish, but in this example I'm just going to use geom_label for the text labels and geom_curve to make some curvy arrows.

You can manually pass individual aes() values through the call to both functions for each annotation. For example, geom_text(aes(x=as.Date('2020-03-01'), y=55,..., but if you have a few in your dataset, it would be advisable to set the annotations based on information within the dataframe itself. I'll do that here first, where we will label two of the points:

df$notes <- c('','','','Wow!','','','OMG!!!','','','')

You can use the value of df$notes to indicate which of the points are getting labeled, and also take advantage of the mapping of x and y values within the same dataset.

Then you just need to add the two geoms to your plot, modifying as you wish to fit your own aesthetics.

p <- p + geom_curve(
    data=df[which(df$notes!=''),],
    mapping=aes(x=dates+0.5, xend=dates, y=y_vals+20, yend=y_vals+2),
    color='red', curvature = 0.5,
    arrow=arrow(length=unit(5,'pt'))
  ) +
  geom_label(
    data=df[which(df$notes!=''),],
    aes(y=y_vals+20, label=notes),
    size=4, color='red', hjust=0
  )
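For a single callout, the annotate() route mentioned earlier might look like the sketch below. It labels only the fourth point, and the offsets (+0.5 days, +20 and +2 on the y axis) are copied from the geom_curve call above. The names pt and p.annot are illustrative.

```r
library(ggplot2)

set.seed(12345)
df <- data.frame(
  dates  = as.Date("2020-03-01") + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = 100
)

pt <- df[4, ]   # the point to call out

p.annot <- ggplot(df, aes(dates, y_vals)) +
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100) +
  # curved arrow from the label position down to just above the point
  annotate(
    "curve",
    x = pt$dates + 0.5, xend = pt$dates,
    y = pt$y_vals + 20, yend = pt$y_vals + 2,
    color = "red", curvature = 0.5,
    arrow = arrow(length = unit(5, "pt"))
  ) +
  # text label at the start of the arrow
  annotate(
    "label",
    x = pt$dates, y = pt$y_vals + 20, label = "Wow!",
    size = 4, color = "red", hjust = 0
  )
```

This is fine for one or two callouts; with more than a few, keeping the notes in the data frame as above is easier to maintain.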

Final thing: Horizontal Lines

One final thing that I noticed in your code before, but forgot to point out: to make your horizontal lines, just use geom_hline. It's much easier. You can do it in two calls to geom_hline pretty easily (and even in just one call if you care to pass a dataframe to the function):

# On ggplot2 >= 3.4.0, use linewidth=2 here; size= is deprecated for lines
p <- p + geom_hline(yintercept = 50, size=2, color='gray30') +
  geom_hline(yintercept = c(25,75), linetype=2, color='gray30')

Just note that it's advisable to add these two geom_hline calls before geom_line or geom_point in the original p.basic plot so they are behind everything else.
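Here is a sketch of the single-call version with a data frame, placed before the other geoms so that the lines sit behind them. It assumes ggplot2 3.4.0 or later for the linewidth aesthetic. The data frame hlines and its column names are made up for illustration, and the identity scales tell ggplot to use the width and linetype values as given.

```r
library(ggplot2)

set.seed(12345)
df <- data.frame(
  dates  = as.Date("2020-03-01") + 0:9,
  y_vals = rnorm(10, 50, 7),
  n      = 100
)

# One row per horizontal line: position, width and line type
hlines <- data.frame(
  y  = c(25, 50, 75),
  lw = c(0.5, 2, 0.5),
  lt = c("dashed", "solid", "dashed")
)

p.lines <- ggplot(df, aes(dates, y_vals)) +
  # drawn first, so it ends up behind the data
  geom_hline(
    data = hlines,
    aes(yintercept = y, linewidth = lw, linetype = lt),
    color = "gray30"
  ) +
  scale_linewidth_identity() +
  scale_linetype_identity() +
  geom_line() +
  geom_point(size = 2.5, shape = 15) +
  ylim(0, 100)
```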
