For graphing a time series in R, how do I set a variable minimum and maximum y axis?

Problem description

I have a large dataset of water quality data over time. I need to create fairly standard graphs in which most sites share the same y-axis scale, so that viewers on our website can visually compare two or more sites with relative ease. Some sites have data outside this standard y-axis range, and we don't want to exclude those data. We want to set the minimum and maximum of the y axis to a standard limit and then expand them when the data goes beyond the base limits.

I have the following data. I tried finding a base dataset, but none seemed to fit what I was looking for. Please point me in the right direction if a base dataset exists that's similar to mine as it will be helpful for posting further questions if I have any. Apologies for the length of data; I wanted enough data points to mimic the real time series data (30 years of water quality data).

y <- c(SITE_ID=1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, YEAR=2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2011, 2012, 2012, 2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016, 2017, 2017, 2018, 2018, 2019, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, PARAMETER="A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "C", "A", "A", "A", "A", "A", "A", "A", "A", "A", "B", "B", "B", "B", "B", "B", "B", "B", "B", "C", "C", "C", "C", "C", "C", "C", "C", "C", VALUE=-11, -20, -50, 97.29, 11.6525, 86.3925, 12.165, 87.465, 12.7975, 91.3125, 13.1025, 98.8275, 12.97, 91.735, 10.5075, 80.3725, 16.1475, 95.395, 0.0475, 0, 0.25, 0.12, 0.3175, 0.0775, 1.2875, 0.0825, 0.9475, 0.2975, 0.26, 0.1, 1.315, 0.02, 0, 0, 0.865, 0, 109.7175, 44.1675, 107.02, 42.7725, 105.065, 43.825, 103.6375, 44.0525, 102.0975, 42.9, 100.045, 43.84, 97.2725, 43.45, 102.56, 47.1875, 94.27, 42.9325, 5, 10, 15, 20, 25, 30, 35, 40, 45, 20, 18, 16, 14, 12, 10, 10, 8, 7, 77, 68, 54, 42, 38, 22, 29, 25, 18)

I create the data frame from that vector and convert YEAR and VALUE from character to numeric:

df <- as.data.frame(matrix(y, ncol = 4, dimnames = list(NULL, c("SITE_ID", "YEAR", "PARAMETER", "VALUE"))))
df$VALUE <- as.numeric(as.character(df$VALUE))
df$YEAR <- as.numeric(as.character(df$YEAR))

I then create a loop to create a graph for each SITE_ID in the data. I group and color by PARAMETER, which holds the three water quality parameters in this dataset.

library(dplyr)    # filter() and %>%
library(ggplot2)

for (site_id in unique(df$SITE_ID)) {
  p <- filter(df, SITE_ID == site_id) %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    scale_y_continuous(limits = c(-55, 150)) +
    scale_x_continuous(breaks = c(2011, 2013, 2015, 2017, 2019))
  print(p)  # print inside the loop so every site is drawn, not only the last one
}

Right now I manually set the y axis limits to -55 and 150 to include all the data. When I limit it to 0 and 80, some of the data is not graphed (excluded) in the code below.

for (site_id in unique(df$SITE_ID)) {
  p <- filter(df, SITE_ID == site_id) %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    scale_y_continuous(limits = c(0, 80)) +  # values outside 0-80 are dropped
    scale_x_continuous(breaks = c(2011, 2013, 2015, 2017, 2019))
  print(p)  # print inside the loop so every site is drawn, not only the last one
}
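
The missing lines are expected ggplot2 behavior rather than a bug in the loop: by default, a continuous scale treats values outside `limits` as missing before the geom is drawn. A minimal sketch of that step, using `scales::censor()` (the scales package is installed as a ggplot2 dependency) on three values taken from the VALUE column; the vector name `site_values` is only for illustration:

```r
library(scales)

# Three values taken from VALUE; two of them lie outside the 0-80 window
site_values <- c(-11, 11.6525, 97.29)

# censor() replaces anything outside the range with NA, which is what
# scale_y_continuous(limits = c(0, 80)) does to the data by default
censored <- censor(site_values, range = c(0, 80))
print(censored)
```

`coord_cartesian(ylim = c(0, 80))` would zoom without dropping any points, but data outside the window would still run off the panel, so it does not solve the problem described here.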

Is there a way to use the standard base of 0 and 80 for the y-axis, but expand the limits when the data goes beyond the limits?

I have tried the following to replace the scale_y_continuous line above, but that seems to set the max based on the max VALUE(s) for the entire dataset. We want just the max VALUE(s) for the specific SITE_ID.

scale_y_continuous(limits=c(min(df$VALUE), max(df$VALUE))) +
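
That line refers to `df$VALUE`, which always holds every site, so the limits cannot change from one loop iteration to the next. A small sketch of the difference, using a made-up three-site data frame (`toy` and its values are illustrative, not the real data):

```r
# Hypothetical miniature of the data: three sites with different ranges
toy <- data.frame(
  SITE_ID = c(1, 1, 2, 2, 3, 3),
  VALUE   = c(-50, 110, 10, 95, 5, 45)
)

# Range over the whole data set: identical no matter which site is plotted
overall_range <- range(toy$VALUE)

# Range per site: split VALUE by SITE_ID before taking the min and max
site_ranges <- tapply(toy$VALUE, toy$SITE_ID, range)

print(overall_range)
print(site_ranges[["3"]])
```

Inside the plotting loop, the equivalent subset would be `df$VALUE[df$SITE_ID == site_id]`.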

I would imagine I would limit the min and max values to specify the df$VALUE portion to just that specific SITE_ID for that loop, but not sure how to do that. Here's what it might look like just typing out what I would like it to do. I also want to incorporate an if/then statement to only run this when any value for that site is greater than 80 or less than 0.

(
IF min(df$VALUE) > 0 AND max(df$VALUE) <80 
    THEN scale_y_continuous(limits=c(0,80))

ELSE (or IF min(df$VALUE) < 0 OR max(df$VALUE) > 80) 
    THEN scale_y_continuous(limits=c(min(df$VALUE of the current SITE_ID we are graphing), max(df$VALUE of the current SITE_ID we are graphing)))
) +
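
The pseudocode above could be written as ordinary R along these lines. This is only a sketch: `site_limits` and its `base` argument are hypothetical names, and values exactly equal to 0 or 80 are treated as inside the standard range, a case the pseudocode leaves open.

```r
# Return the y-axis limits for one site's values.
site_limits <- function(values, base = c(0, 80)) {
  data_range <- range(values, na.rm = TRUE)
  if (data_range[1] >= base[1] && data_range[2] <= base[2]) {
    base        # everything fits: keep the standard limits
  } else {
    data_range  # something falls outside: use this site's own min and max
  }
}

print(site_limits(c(5, 10, 45)))     # all values inside 0-80
print(site_limits(c(-50, 12, 109)))  # values on both sides of 0-80
```

In the loop it would be called as `scale_y_continuous(limits = site_limits(df$VALUE[df$SITE_ID == site_id]))`. As in the pseudocode, a site that only exceeds 80 also loses the 0 baseline; keeping the untouched side at its standard value would need `min(0, ...)` and `max(80, ...)` instead.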

So with this background information and steps completed so far, how do I set the minimum and maximum y axis if the values are more or less than 0 or 80, but keep the 0 and 80 limits if all data values fall within that range?

Recommended answer

I would write a function for this...

library(dplyr)    # filter() and %>%
library(ggplot2)

plot_func <- function(dataset, site) {

  plot_df <- filter(dataset, SITE_ID == site)
  # Start from the standard 0-80 range and widen a side only when
  # this site's data goes beyond it
  min_value <- min(0, plot_df$VALUE)
  max_value <- max(80, plot_df$VALUE)

  p <- plot_df %>%
    ggplot(aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    scale_y_continuous(limits = c(min_value, max_value)) +
    scale_x_continuous(breaks = c(2011, 2013, 2015, 2017, 2019))

  return(p)

}

# Draw one plot per site
for (site_id in unique(df$SITE_ID)) {
  plot_out <- plot_func(df, site_id)
  print(plot_out)
}
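
A possible alternative, sketched here under the assumption that ggplot2's default axis padding is acceptable, is to let ggplot2 compute the limits and only require that 0 and 80 are included, using `expand_limits()`. The helper name `plot_site` and the toy data frames are illustrative.

```r
library(ggplot2)

# Hypothetical helper: plot one site's rows with 0 and 80 always on the y axis
plot_site <- function(plot_df) {
  ggplot(plot_df, aes(x = YEAR, y = VALUE, group = PARAMETER, color = PARAMETER)) +
    geom_line() +
    theme_classic() +
    expand_limits(y = c(0, 80))  # widens, never narrows, the data-driven limits
}

# Toy data: one site inside the standard range, one outside it
inside  <- data.frame(YEAR = 2011:2013, VALUE = c(5, 10, 45),    PARAMETER = "A")
outside <- data.frame(YEAR = 2011:2013, VALUE = c(-50, 12, 109), PARAMETER = "A")

p_inside  <- plot_site(inside)   # y axis covers 0 to 80
p_outside <- plot_site(outside)  # y axis stretches to cover -50 and 109
```

Calling `print(p_inside)` or `print(p_outside)` draws the plots. Because no `limits` are set on the scale, no values are censored, so there is no need to compute a minimum and maximum per site by hand.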
