Question about grouping time series by two different columns

Question

I am working with NDVI data from 60 pixels covering the past 36 years. I have multiple NDVI values per year, and I am trying to calculate community stability using the codyn package. However, the community_stability function requires one value per time step (i.e., year); otherwise, it sums all the NDVI values for that year per site. So I need to group by pixel (site) and by year to calculate an average per year, but I am having difficulty figuring out how to group by two different factors. Here's a snapshot of my data frame layout:

       Date year month_num     Season        site  NDVI            site_season
1      5309 1984        07 Transition   M1CAH1SUR 0.317   M1CAH1SUR_Transition
2      5405 1984        10        Dry   M1CAH1SUR 0.208          M1CAH1SUR_Dry
3      5613 1985        05 Transition   M1CAH1SUR 0.480   M1CAH1SUR_Transition
4      5677 1985        07 Transition   M1CAH1SUR 0.316   M1CAH1SUR_Transition
5      5693 1985        08        Dry   M1CAH1SUR 0.315          M1CAH1SUR_Dry

...

Can anyone help me with code to group by site and year, so that I get one mean NDVI value per year for each site? Any help will be greatly appreciated!

I tried using dplyr as follows:

NDVIplot_long %>%
    group_by(site, year, add = TRUE) %>%
    summarize(mean_NDVI = mean(NDVI, na.rm = TRUE))

But it returns only a single value:

  mean_NDVI
1 0.2825419

I expected one value per year (1984, 1985, 1986, etc.) for each of the 60 sites. Instead, only one value was returned.

Recommended answer

The issue is most likely that plyr is loaded as well, so plyr::summarise masks the function of the same name from dplyr. plyr::summarise does not respect dplyr groupings, so it collapses the whole data frame to a single row. We can call dplyr::summarise explicitly:

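library(dplyr)
NDVIplot_long %>%
  # add = TRUE is not needed here because the data frame is not grouped yet
  # (the argument is deprecated in favour of .add since dplyr 1.0.0)
  group_by(site, year) %>%
  # call dplyr's version explicitly so plyr cannot mask it
  dplyr::summarize(mean_NDVI = mean(NDVI, na.rm = TRUE))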
# A tibble: 2 x 3
# Groups:   site [1]
#  site       year mean_NDVI
#  <chr>     <int>     <dbl>
#1 M1CAH1SUR  1984     0.262
#2 M1CAH1SUR  1985     0.370
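
To confirm that masking is what is happening in a given session, one option is to ask R which attached packages define a function with that name. The sketch below uses utils::find from base R; the helper name who_provides is made up for illustration and is not part of dplyr or plyr.

```r
# Diagnostic sketch: list every attached package that defines a function
# with the given name, in search-path order. The first entry is the one
# an unqualified call resolves to.
who_provides <- function(fn_name) {
  utils::find(fn_name, mode = "function")
}

# With neither plyr nor dplyr attached, both calls return character(0).
print(who_provides("summarise"))
print(who_provides("summarize"))
```

If the result lists "package:plyr" before "package:dplyr", an unqualified summarise() call resolves to the plyr version. Attaching plyr before dplyr, or always writing dplyr::summarise, avoids the problem.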

The single-mean output is reproducible as well (though the number is different, possibly because the OP used the full dataset):

NDVIplot_long %>%
   group_by(site, year) %>%
   # plyr's version ignores the dplyr grouping and averages over all rows
   plyr::summarize(mean_NDVI = mean(NDVI, na.rm = TRUE))
#  mean_NDVI
#1    0.3272



Data

NDVIplot_long <- structure(list(
  Date = c(5309L, 5405L, 5613L, 5677L, 5693L),
  year = c(1984L, 1984L, 1985L, 1985L, 1985L),
  month_num = c(7L, 10L, 5L, 7L, 8L),
  Season = c("Transition", "Dry", "Transition", "Transition", "Dry"),
  site = c("M1CAH1SUR", "M1CAH1SUR", "M1CAH1SUR", "M1CAH1SUR", "M1CAH1SUR"),
  NDVI = c(0.317, 0.208, 0.48, 0.316, 0.315),
  site_season = c("M1CAH1SUR_Transition", "M1CAH1SUR_Dry",
                  "M1CAH1SUR_Transition", "M1CAH1SUR_Transition",
                  "M1CAH1SUR_Dry")
), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
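
As an alternative that sidesteps the masking problem altogether, the same per-site, per-year mean can be sketched with base R's aggregate, which does not depend on either plyr or dplyr. This is not the approach from the answer above, just an illustration; the names ndvi and yearly_mean are made up here, and the toy data copies only the three columns needed from the sample rows.

```r
# Toy data mirroring the five sample rows (only the columns used below).
ndvi <- data.frame(
  site = rep("M1CAH1SUR", 5),
  year = c(1984L, 1984L, 1985L, 1985L, 1985L),
  NDVI = c(0.317, 0.208, 0.480, 0.316, 0.315)
)

# One mean per site-year combination. With the formula interface, rows
# containing NA in the variables used are dropped before averaging
# (default na.action = na.omit).
yearly_mean <- aggregate(NDVI ~ site + year, data = ndvi, FUN = mean)

# Rename the value column to match the dplyr output above.
names(yearly_mean)[names(yearly_mean) == "NDVI"] <- "mean_NDVI"
print(yearly_mean)
```

On the sample rows this gives 0.2625 for 1984 and about 0.370 for 1985, in line with the dplyr::summarize result.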
