Why does coord_equal break my heatmap?
Question
I'm trying to create a heatmap out of the following data:
> head(myData.aggregated)
             datetime value       date                time
1 2016-03-31 14:19:00     3 2016-03-31 2016-06-11 14:19:00
2 2016-03-31 14:49:00    69 2016-03-31 2016-06-11 14:49:00
3 2016-03-31 15:49:00     5 2016-03-31 2016-06-11 15:49:00
4 2016-03-31 16:19:00     7 2016-03-31 2016-06-11 16:19:00
5 2016-03-31 17:49:00     2 2016-03-31 2016-06-11 17:49:00
6 2016-03-31 18:19:00     7 2016-03-31 2016-06-11 18:19:00
> tail(myData.aggregated)
              datetime value       date                time
90 2016-04-06 13:19:00     1 2016-04-06 2016-06-11 13:19:00
91 2016-04-06 13:49:00    25 2016-04-06 2016-06-11 13:49:00
92 2016-04-06 14:19:00     7 2016-04-06 2016-06-11 14:19:00
93 2016-04-06 14:49:00     1 2016-04-06 2016-06-11 14:49:00
94 2016-04-06 22:19:00     3 2016-04-06 2016-06-11 22:19:00
95 2016-04-06 22:49:00    14 2016-04-06 2016-06-11 22:49:00
And the following ggplot2 command:
ggplot(myData.aggregated, aes(x = time, y = date, fill = scale(value))) + geom_tile() + coord_equal()
As soon as I add coord_equal(), the result is a blank graph. Can someone explain why this is happening and how I can fix it? My goal is to get a heatmap with square tiles for each 30-minute interval.
Update 1:
> dput(head(myData.aggregated))
structure(list(datetime = structure(c(1459426740, 1459428540,
1459432140, 1459433940, 1459439340, 1459441140), class = c("POSIXct",
"POSIXt"), tzone = ""), value = c(3L, 69L, 5L, 7L, 2L, 7L), date = structure(c(16891,
16891, 16891, 16891, 16891, 16891), class = "Date"), time = structure(c(1465647540,
1465649340, 1465652940, 1465654740, 1465660140, 1465661940), class = c("POSIXct",
"POSIXt"), tzone = "")), .Names = c("datetime", "value", "date",
"time"), row.names = c(NA, 6L), class = "data.frame")
Recommended answer
TL;DR: The y-axis spans six units and the x-axis spans tens of thousands of units. When you add coord_equal, the y-axis gets squashed to roughly 1/10,000th the physical length of the x-axis, effectively making the plot area disappear. The date column (y-axis) happens to be in days and the time column (x-axis) in seconds, but ggplot treats both as unitless numbers. You can denominate the y-axis in seconds as well, but that still gives you a plot with an undesirable aspect ratio of at least 6:1. See below for code and additional detail.
Here's what's happening: date is in Date format and is therefore denominated in days, with a range of 6 days. time is in POSIXct format, which is denominated in seconds. Since we're only interested in the time of day, regardless of date, its range is tens of thousands of seconds (up to a maximum of 86,400 seconds, the length of one day).
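The numeric values behind the two classes can be inspected directly in base R. This is a small sketch rather than part of the original answer: the dates and times are copied from the question's printed data, and tz = "UTC" is an assumption made only so the result does not depend on the local time zone.

```r
# Endpoints taken from the question's head()/tail() output; tz = "UTC" is assumed
dates <- as.Date(c("2016-03-31", "2016-04-06"))
times <- as.POSIXct(c("2016-06-11 13:19:00", "2016-06-11 22:49:00"), tz = "UTC")

as.numeric(dates)   # days since 1970-01-01
as.numeric(times)   # seconds since 1970-01-01 00:00:00 UTC

date_span <- diff(as.numeric(dates))   # span of the y values, in days
time_span <- diff(as.numeric(times))   # span of these two x values, in seconds
```

Here date_span is a single-digit number of days while time_span is tens of thousands of seconds, which is the mismatch described above.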
The underlying values of Date and POSIXct objects are just numeric values with, respectively, Date and POSIXct classes attached. As a result, when you add coord_equal, one unit on the y-axis takes up the same physical distance as one unit on the x-axis, because ggplot (apparently) computes coord_equal from the numeric magnitudes of the values, without regard to their date-time class. But the entire y-axis spans 6 units while the x-axis spans tens of thousands of units. Thus, when you require coord_equal, the y:x aspect ratio gets squashed to something on the order of 1:10,000, making the plot disappear for all practical purposes.
You can denominate both the x and y axes in seconds, but even then the y-axis (6 days) will span at least six times the range of the x-axis (one day at most), resulting in a y:x aspect ratio of at least 6:1 with coord_equal. That is better than 1:10,000, but still not very practical.
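The two aspect ratios can be worked out with plain arithmetic. The sketch below uses round, illustrative spans (six days on y, one full day on x), not values measured from the asker's data.

```r
# Illustrative spans: 6 days on the y-axis, one full day of seconds on the x-axis
seconds_per_day <- 86400
y_span_days     <- 6
x_span_seconds  <- seconds_per_day

# y in days, x in seconds: panel height relative to width under coord_equal()
ratio_days <- y_span_days / x_span_seconds        # a tiny fraction

# y converted to seconds as well
y_span_seconds <- y_span_days * seconds_per_day
ratio_seconds  <- y_span_seconds / x_span_seconds # 6

c(days = ratio_days, seconds = ratio_seconds)
```

If the x values cover less than a full day, x_span_seconds shrinks and ratio_seconds grows beyond 6, which is why the ratio is "at least" 6:1.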
Here's an example with fake data:
library(ggplot2)

# Fake data: one value per hour over six days
set.seed(4959)
dat = data.frame(datetime=seq(as.POSIXct("2016-03-31"), as.POSIXct("2016-04-06"), by="hour"))
dat$value = sample(1:50, nrow(dat), replace=TRUE)

# Both axes in seconds: x is the time of day, y is midnight of each date
ggplot(dat,
       aes(x = as.POSIXct(as.numeric(datetime) %% 86400,
                          tz="UTC", origin=as.Date("2016-01-01")),
           y = as.POSIXct(as.Date(datetime)),
           fill = scale(value))) +
  geom_tile() +
  labs(y="Date", x="Time") +
  scale_x_datetime(date_labels="%H:%M") +  # %M is minutes; %m would print the month
  coord_equal()
In the code above, to create the y values we first convert to Date format, which drops the time of day, and then convert back to POSIXct, which puts the unit back in seconds but with the time set to midnight for all datetime values on a given date.
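The round trip through Date can be checked on a single value. This is a minimal sketch; the timestamp and tz = "UTC" are assumptions chosen for illustration.

```r
# One hypothetical timestamp; tz = "UTC" is assumed for reproducibility
x <- as.POSIXct("2016-04-02 14:19:00", tz = "UTC")

d <- as.Date(x)       # drops the time of day; unit is now days
y <- as.POSIXct(d)    # back to seconds, at midnight UTC of that date

as.numeric(d)         # days since 1970-01-01
as.numeric(y)         # the same instant expressed in seconds
```

The numeric value of y is the numeric value of d multiplied by 86400, so every timestamp on the same date maps to the same y value.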
To create the x values, we just want the time of day in seconds after midnight, so we take the remainder of the numeric value of datetime after division by 86400 (the number of seconds in a day). tz="UTC" is necessary to get the hours right, and origin (which can be any date, since we only want the time of day) is necessary for the function to run without an error.
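The modulo step can be tried on one value in base R. This is a sketch with a made-up timestamp; tz = "UTC" and the origin date are assumptions for illustration.

```r
# One hypothetical timestamp; tz = "UTC" is assumed for reproducibility
x <- as.POSIXct("2016-04-02 14:19:00", tz = "UTC")

# Seconds elapsed since the most recent midnight UTC
secs <- as.numeric(x) %% 86400

# Re-attach the POSIXct class; the origin date is arbitrary
tod <- as.POSIXct(secs, tz = "UTC", origin = "2016-01-01")

secs                  # 14 hours and 19 minutes, in seconds
format(tod, "%H:%M")  # time of day only
```

Because the remainder is taken on seconds counted from midnight UTC, labelling the result in any other time zone would shift the displayed hours.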
Below is what the plot looks like with and without coord_equal. Note that with coord_equal the x-axis, which spans one day (from midnight to midnight), has the same length as one day on the y-axis. That's because we denominated both the y and x values in seconds. However, as long as the y-axis spans several days and the x-axis spans only one day, coord_equal will result in an undesirable aspect ratio.
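An alternative that the answer above does not use, offered only as a sketch: coord_fixed() accepts a ratio argument giving the length of one y unit relative to one x unit. If y stays in days (each tile is 1 unit tall) and x is in seconds (each 30-minute tile is 1800 units wide), a ratio of 1800 should make each tile square. The data frame demo, its 30-minute spacing, and tz = "UTC" are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical 30-minute data over the same week as the fake data above
set.seed(1)
demo <- data.frame(
  datetime = seq(as.POSIXct("2016-03-31 00:00:00", tz = "UTC"),
                 as.POSIXct("2016-04-06 23:30:00", tz = "UTC"),
                 by = "30 min")
)
demo$value <- sample(1:50, nrow(demo), replace = TRUE)
demo$date  <- as.Date(demo$datetime)                 # y values, in days
demo$time  <- as.POSIXct(as.numeric(demo$datetime) %% 86400,
                         tz = "UTC", origin = "1970-01-01")  # x values, in seconds

tile_width  <- 1800  # x units (seconds) covered by one 30-minute tile
tile_height <- 1     # y units (days) covered by one tile

p <- ggplot(demo, aes(x = time, y = date, fill = value)) +
  geom_tile() +
  labs(y = "Date", x = "Time") +
  scale_x_datetime(date_labels = "%H:%M") +
  coord_fixed(ratio = tile_width / tile_height)
p
```

With 48 tiles per day and 7 days, the resulting panel is wide rather than tall, which matches the goal of square tiles for each 30-minute interval.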
Below is a demonstration of how the y-axis gets squashed relative to the x-axis if the y values are denominated in days rather than seconds and coord_equal is specified:
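# y in days (Date), x in seconds (POSIXct): the panel collapses under coord_equal()
ggplot(dat,
       aes(x = as.POSIXct(as.numeric(datetime) %% 86400,
                          tz="UTC", origin=as.Date("2016-01-01")),
           y = as.Date(datetime),
           fill = scale(value))) +
  geom_tile() +
  labs(y="Date", x="Time") +
  scale_x_datetime(date_labels="%H:%M") +  # %M is minutes; %m would print the month
  coord_equal()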