For graphing a time series in R, how do I set a variable minimum and maximum y axis?
Problem description
I have a large dataset of water quality data over time. I need to create standardized graphs in which most sites share the same y-axis scale, so that viewers on our website can visually compare two or more sites with relative ease. Some sites have data outside this standard y-axis range, and we don't want to exclude those data. We want to set the minimum and maximum of the y axis to a standard limit and then expand it when the data go beyond the base limits.
I have the following data. I tried finding a base dataset, but none seemed to fit what I was looking for. Please point me in the right direction if a base dataset exists that's similar to mine as it will be helpful for posting further questions if I have any. Apologies for the length of data; I wanted enough data points to mimic the real time series data (30 years of water quality data).
y <- c(SITE_ID=1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, YEAR=2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, PARAMETER="A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "C", VALUE=-11, -20, -50, 97.29, 11.6525, 86.3925, 12.165, 87.465, 12.7975, 91.3125, 13.1025, 98.8275, 12.97, 91.735, 10.5075, 80.3725, 16.1475, 95.395, 0.0475, 0, 0.25, 0.12, 0.3175, 0.0775, 1.2875, 0.0825, 0.9475, 0.2975, 0.26, 0.1, 1.315, 0.02, 0, 0, 0.865, 0, 109.7175, 44.1675, 107.02, 42.7725, 105.065, 43.825, 103.6375, 44.0525, 102.0975, 42.9, 100.045, 43.84, 97.2725, 43.45, 102.56, 47.1875, 94.27, 42.9325, 5, 10, 15, 20, 25, 30, 35, 40, 45, 20, 18, 16, 14, 12, 10, 10, 8, 7, 77, 68, 54, 42, 38, 22, 29, 25, 18)
I create the data frame from that list and reclassify the characters as numeric for YEAR and VALUE:
df <- as.data.frame(matrix(y, ncol = 4, dimnames = list(NULL, c("SITE_ID", "YEAR", "PARAMETER", "VALUE"))))
df$VALUE <- as.numeric(as.character(df$VALUE))
df$YEAR <- as.numeric(as.character(df$YEAR))
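As a side note on why the two conversions above are needed: combining numbers and strings in a single `c()` call coerces every element to character, so YEAR and VALUE arrive in the data frame as text. A minimal sketch in base R, using a made-up four-element vector (`mixed` is a hypothetical name, not part of the original data):

```r
# Hypothetical mini-vector mixing numbers and a string, as in the data above
mixed <- c(SITE_ID = 1, 2, PARAMETER = "A", VALUE = -11)

# Because one element is a string, the whole vector is stored as character
class(mixed)

# Converting back recovers the number
value_num <- as.numeric(mixed[["VALUE"]])
value_num
```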
I then create a loop to create a graph for each SITE_ID in the data. I group and color by PARAMETER which are the three water quality parameters in this dataset.
library(dplyr)    # filter(), %>%
library(ggplot2)

for (site_id in unique(df$SITE_ID))
{
  p <- filter(df, SITE_ID == site_id) %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    scale_y_continuous(limits=c(-55, 150)) +
    scale_x_continuous(breaks=c(2011,2013,2015,2017,2019))
  print(p)  # print inside the loop so every site's plot is drawn, not just the last
}
Right now I manually set the y axis limits to -55 and 150 to include all the data. When I limit it to 0 and 80, some of the data is not graphed (excluded) in the code below.
for (site_id in unique(df$SITE_ID))
{
  p <- filter(df, SITE_ID == site_id) %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    # values outside 0-80 are treated as missing and dropped from the lines
    scale_y_continuous(limits=c(0, 80)) +
    scale_x_continuous(breaks=c(2011,2013,2015,2017,2019))
  print(p)  # print inside the loop so every site's plot is drawn
}
Is there a way to use the standard base of 0 and 80 for the y-axis, but expand the limits when the data goes beyond the limits?
I have tried the following to replace the scale_y_continuous line above, but that seems to set the max based on the max VALUE(s) for the entire dataset. We want just the max VALUE(s) for the specific SITE_ID.
scale_y_continuous(limits=c(min(df$VALUE), max(df$VALUE))) +
I imagine I would restrict the df$VALUE portion of the min and max calls to the specific SITE_ID for that loop iteration, but I'm not sure how to do that. Below is roughly what I would like it to do, written out as pseudocode. I also want to incorporate an if/then statement so that the limits expand only when a value for that site is greater than 80 or less than 0.
(
IF min(df$VALUE) > 0 AND max(df$VALUE) <80
THEN scale_y_continuous(limits=c(0,80))
ELSE (or IF min(df$VALUE) < 0 OR max(df$VALUE) > 80)
THEN scale_y_continuous(limits=c(min(df$VALUE of the current SITE_ID we are graphing), max(df$VALUE of the current SITE_ID we are graphing)))
) +
So with this background information and the steps completed so far, how do I expand the minimum and maximum of the y axis when a site's values fall below 0 or above 80, but keep the 0 and 80 limits when all of its values fall within that range?
Recommended answer
I would write a function for this...
library(dplyr)    # filter(), %>%
library(ggplot2)

# Plot one site, keeping the standard 0-80 y axis unless that site's data fall outside it
plot_func <- function(dataset, site) {
  plot_df <- filter(dataset, SITE_ID == site)
  # Lower limit: 0, or the site's minimum if that is below 0
  min_value <- min(0, plot_df$VALUE)
  # Upper limit: 80, or the site's maximum if that is above 80
  max_value <- max(80, plot_df$VALUE)
  p <- plot_df %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    scale_y_continuous(limits=c(min_value, max_value)) +
    scale_x_continuous(breaks=c(2011,2013,2015,2017,2019))
  return(p)
}

# Draw one plot per site
for (site_id in unique(df$SITE_ID)) {
  plot_out <- plot_func(df, site_id)
  print(plot_out)
}
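The limit logic can also be pulled out into a small helper and checked on its own. The following is only a sketch in base R; `padded_limits` and its `base` argument are hypothetical names, not part of the answer above. It relies on the fact that the range of the base limits combined with the data is never narrower than the base limits.

```r
# Hypothetical helper: return y-axis limits that always cover `base`
# and widen only as far as the data require.
padded_limits <- function(values, base = c(0, 80)) {
  range(c(base, values), na.rm = TRUE)
}

# A site whose values all sit inside 0-80 keeps the standard limits
lims_inside <- padded_limits(c(5, 10, 45, 77))

# A site with values below 0 and above 80 gets wider limits
lims_outside <- padded_limits(c(-50, 16.1475, 109.7175))

lims_inside
lims_outside
```

Inside `plot_func`, this could be used as `scale_y_continuous(limits = padded_limits(plot_df$VALUE))`. ggplot2 also provides `expand_limits(y = c(0, 80))`, which makes the axis include those values without dropping any data, and may be a simpler alternative if the default axis padding is acceptable.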