How to change x axis from years to months with ggplot2
Problem description
I have a web visits over time chart which plots daily traffic from 2014 until now, and looks like this:
ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views)) +
  geom_line() +
  scale_y_continuous(labels = comma) +
  ylim(0, 50000)
As you can see, it's not a great graph. It would make more sense to break it down by month rather than by day. However, when I try this code:
ggplot(subset(APRA, Post_Day > "2013-12-31"), aes(x = Post_Day, y = Page_Views)) +
  geom_line() +
  scale_y_continuous(labels = comma) +
  ylim(0, 50000) +
  scale_x_date(date_breaks = "1 month", minor_breaks = "1 week", labels = date_format("%B"))
I get this error:
Error: Invalid input: date_trans works with objects of class Date only
The date field Post_Day is POSIXct. Page_Views is numeric. Data looks like:
Post_Title   Post_Day    Page_Views
Title 1      2016-05-15  139
Title 2      2016-05-15  61
Title 3      2016-05-15  79
Title 4      2016-05-16  125
Title 5      2016-05-17  374
Title 6      2016-05-17  39
Title 7      2016-05-17  464
Title 8      2016-05-17  319
Title 9      2016-05-18  84
Title 10     2016-05-18  64
Title 11     2016-05-19  433
Title 12     2016-05-19  418
Title 13     2016-05-19  124
Title 14     2016-05-19  422
I'm looking to change the X axis from a daily granularity into monthly.
Solution
The sample data set shown in the question has multiple data points per day, so it needs to be aggregated day-wise anyway. For the aggregation by day or by month, data.table and lubridate are used.
Create sample data
As no reproducible example is supplied, a sample data set is created:
library(data.table)
n_rows <- 5000L
n_days <- 365L * 3L
set.seed(123L)
DT <- data.table(Post_Title = paste("Title", 1:n_rows),
                 Post_Day = as.Date("2014-01-01") + sample(0:n_days, n_rows, replace = TRUE),
                 Page_Views = round(abs(rnorm(n_rows, 500, 200))))[order(Post_Day)]
DT
       Post_Title   Post_Day Page_Views
   1:    Title 74 2014-01-01        536
   2:   Title 478 2014-01-01        465
   3:  Title 3934 2014-01-01        289
   4:  Title 4136 2014-01-01        555
   5:   Title 740 2014-01-02        442
  ---
4996:  Title 1478 2016-12-31        586
4997:  Title 2251 2016-12-31        467
4998:  Title 2647 2016-12-31        468
4999:  Title 3243 2016-12-31        498
5000:  Title 4302 2016-12-31        309
Plot raw data
Without aggregation, the data can be plotted with:
library(ggplot2)
ggplot(DT) + aes(Post_Day, Page_Views) + geom_line()
Aggregated by day
ggplot(DT[, .(Page_Views = sum(Page_Views)), by = Post_Day]) +
aes(Post_Day, Page_Views) + geom_line()
To aggregate day-wise, the grouping parameter by of data.table is used, with sum() as the aggregation function. The aggregation reduces the number of data points from 5000 to 1087. Hence, the plot looks less convoluted.
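As a minimal sketch of what the day-wise grouping does, the same by expression can be tried on a tiny table. The table toy below is hypothetical and only borrows a few values from the question's sample rows:

```r
library(data.table)

# Hypothetical miniature of the question's data: several posts per day
toy <- data.table(
  Post_Day   = as.Date(c("2016-05-15", "2016-05-15", "2016-05-15", "2016-05-16")),
  Page_Views = c(139, 61, 79, 125)
)

# One row per day, with the page views of all posts on that day summed up
daily <- toy[, .(Page_Views = sum(Page_Views)), by = Post_Day]
daily
```

The four input rows collapse to two, one per distinct Post_Day, which is the same effect that takes DT from 5000 rows down to one row per day.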
Aggregated by month
ggplot(DT[, .(Page_Views = sum(Page_Views)),
by = .(Post_Month = lubridate::floor_date(Post_Day, "month"))]) +
aes(Post_Month, Page_Views) + geom_line()
In order to aggregate by month, the grouping parameter by is used again, but this time Post_Day is mapped to the first day of the respective month. So, 2014-03-26 becomes a Post_Month of 2014-03-01. floor_date() keeps the class of its input, so Post_Month is of class Date in the sample data above (and would stay POSIXct for the POSIXct column in the question). By this, the x-axis remains continuous with a date scale. This avoids the trouble of converting Post_Day to a factor or character value, e.g., "2014-03" using format(Post_Day, "%Y-%m"), where the x-axis would become discrete.
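The error quoted in the question appears because scale_x_date() only accepts columns of class Date, while Post_Day there is POSIXct. The following is a sketch, not the original answer's code, of one way to combine the monthly aggregation with month labels on the axis. The table web is hypothetical, and the label format "%b %Y" is just one choice. An alternative that is not shown here would be to keep the POSIXct column and use scale_x_datetime() instead.

```r
library(data.table)
library(ggplot2)

# Hypothetical small data set with Post_Day stored as POSIXct, as in the question
set.seed(1L)
web <- data.table(
  Post_Day   = as.POSIXct("2016-01-01", tz = "UTC") +
    sample(0:180, 500, replace = TRUE) * 86400,
  Page_Views = round(abs(rnorm(500, 500, 200)))
)

# Convert to Date first, then map every day to the first day of its month
monthly <- web[, .(Page_Views = sum(Page_Views)),
               by = .(Post_Month = lubridate::floor_date(as.Date(Post_Day), "month"))]

class(monthly$Post_Month)                       # a Date column, accepted by scale_x_date()
class(format(as.Date("2014-03-26"), "%Y-%m"))   # character, which would give a discrete axis

# Continuous date axis with one labelled break per month
p <- ggplot(monthly) +
  aes(Post_Month, Page_Views) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b %Y")
p
```

Note that as.Date() on a POSIXct value depends on the time zone, so days near midnight may shift if the data is not stored in UTC.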