How to change x axis from years to months with ggplot2


Problem description


I have a web visits over time chart which plots daily traffic from 2014 until now, and looks like this:

 ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views))+
   geom_line()+
   scale_y_continuous(labels = comma)+
   ylim(0,50000)

As you can see, it's not a great graph; it would make more sense to break it down by month rather than by day. However, when I try this code:

 ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views))+
   geom_line()+
   scale_y_continuous(labels = comma)+
   ylim(0,50000)+
   scale_x_date(date_breaks = "1 month", minor_breaks = "1 week", labels = date_format("%B"))

I get this error:

Error: Invalid input: date_trans works with objects of class Date only

The date field Post_Day is POSIXct. Page_Views is numeric. Data looks like:

Post_Title  Post_Day    Page_Views
Title 1     2016-05-15  139
Title 2     2016-05-15  61
Title 3     2016-05-15  79
Title 4     2016-05-16  125
Title 5     2016-05-17  374
Title 6     2016-05-17  39
Title 7     2016-05-17  464
Title 8     2016-05-17  319
Title 9     2016-05-18  84
Title 10    2016-05-18  64
Title 11    2016-05-19  433
Title 12    2016-05-19  418
Title 13    2016-05-19  124
Title 14    2016-05-19  422

I'm looking to change the X axis from a daily granularity into monthly.

Solution

The sample data set shown in the question has multiple data points per day. So, it needs to be aggregated day-wise anyway. For the aggregation by day or month, data.table and lubridate are used.

Create sample data

As no reproducible example is supplied, a sample data set is created:

library(data.table)
n_rows <- 5000L
n_days <- 365L*3L
set.seed(123L)
DT <- data.table(Post_Title = paste("Title", 1:n_rows),
                 Post_Day = as.Date("2014-01-01") + sample(0:n_days, n_rows, replace = TRUE),
                 Page_Views = round(abs(rnorm(n_rows, 500, 200))))[order(Post_Day)]
DT

      Post_Title   Post_Day Page_Views
   1:   Title 74 2014-01-01        536
   2:  Title 478 2014-01-01        465
   3: Title 3934 2014-01-01        289
   4: Title 4136 2014-01-01        555
   5:  Title 740 2014-01-02        442
  ---                                 
4996: Title 1478 2016-12-31        586
4997: Title 2251 2016-12-31        467
4998: Title 2647 2016-12-31        468
4999: Title 3243 2016-12-31        498
5000: Title 4302 2016-12-31        309

Plot raw data

Without aggregation, the data can be plotted with:

library(ggplot2)
ggplot(DT) + aes(Post_Day, Page_Views) + geom_line()

Aggregated by day

ggplot(DT[, .(Page_Views = sum(Page_Views)), by = Post_Day]) + 
  aes(Post_Day, Page_Views) + geom_line()

To aggregate by day, the grouping parameter by of data.table is used, with sum() as the aggregation function. The aggregation reduces the number of data points from 5000 to 1087 (one per distinct day), so the plot looks less cluttered.
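The effect of the daily aggregation can be checked without plotting. The following is a minimal sketch that rebuilds the sample data and confirms that the aggregate has one row per distinct day and preserves the total page views. The name daily is introduced here for illustration only, and the exact row count depends on the random sample, so none is stated.

```r
library(data.table)

# Rebuild the sample data exactly as in the answer
n_rows <- 5000L
n_days <- 365L * 3L
set.seed(123L)
DT <- data.table(Post_Title = paste("Title", 1:n_rows),
                 Post_Day = as.Date("2014-01-01") + sample(0:n_days, n_rows, replace = TRUE),
                 Page_Views = round(abs(rnorm(n_rows, 500, 200))))[order(Post_Day)]

# One row per day, page views summed within each day
daily <- DT[, .(Page_Views = sum(Page_Views)), by = Post_Day]

nrow(daily)                                   # number of rows after aggregation
uniqueN(DT$Post_Day)                          # number of distinct days in the raw data
sum(daily$Page_Views) == sum(DT$Page_Views)   # totals are unchanged by grouping
```

Storing the aggregate in a variable first also makes it easy to inspect before handing it to ggplot().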

Aggregated by month

ggplot(DT[, .(Page_Views = sum(Page_Views)), 
          by = .(Post_Month = lubridate::floor_date(Post_Day, "month"))]) + 
  aes(Post_Month, Page_Views) + geom_line()

To aggregate by month, the grouping parameter by is used again, but this time Post_Day is mapped to the first day of its month with lubridate::floor_date(). So, 2014-03-26 becomes a Post_Month of 2014-03-01, which keeps the class of the input (Date for the sample data; a POSIXct column would stay POSIXct). As a result, the x-axis remains continuous with a date scale. This avoids the trouble of converting Post_Day to a character string or factor, e.g., "2014-03" using format(Post_Day, "%Y-%m"), where the x-axis would become discrete.
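The error quoted in the question comes from passing a POSIXct column to scale_x_date(), which accepts Date objects only. The sketch below is one possible way to combine the monthly aggregation with month labels on the axis: the column is converted with as.Date() before flooring, so scale_x_date() can be used. The data set APRA_like is a hypothetical stand-in for the asker's data, and the label format "%b %Y" is just an example choice. Keeping the column as POSIXct and using scale_x_datetime() instead would be an alternative.

```r
library(data.table)
library(ggplot2)

# Hypothetical stand-in for the asker's data: Post_Day stored as POSIXct
set.seed(1L)
APRA_like <- data.table(
  Post_Day   = as.POSIXct("2014-01-01", tz = "UTC") +
    86400 * sample(0:364, 500L, replace = TRUE),
  Page_Views = round(abs(rnorm(500L, 500, 200)))
)

# Convert to Date, then map each day to the first day of its month
monthly <- APRA_like[, .(Page_Views = sum(Page_Views)),
                     by = .(Post_Month = lubridate::floor_date(as.Date(Post_Day), "month"))]

class(monthly$Post_Month)   # class of the grouping column after flooring

# Continuous date axis with one labelled break per month
p <- ggplot(monthly) +
  aes(Post_Month, Page_Views) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
p
```

With several years of data, a wider spacing such as date_breaks = "3 months" may keep the labels readable.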
