Using dplyr summary function on yearmon from zoo


Problem description


I have a data frame with values associated with a year and month. I use the yearmon class from the zoo package to store the year-month information.

My aim is to compute the average of the values that share the same year-month. However, doing this with dplyr gives me an error.

The variable tst is reproduced below:

> str(tst)
'data.frame':   20 obs. of  2 variables:
 $ n : int  23 24 26 27 26 23 19 19 22 22 ...
 $ ym:Class 'yearmon'  num [1:20] 2004 2004 2004 2004 2004 ...
> dput(tst)
structure(list(n = c(23L, 24L, 26L, 27L, 26L, 23L, 19L, 19L, 
22L, 22L, 22L, 22L, 26L, 26L, 19L, 22L, 26L, 25L, 22L, 18L), 
    ym = structure(c(2004, 2004, 2004, 2004, 2004.08333333333, 
    2004.08333333333, 2004.08333333333, 2004.08333333333, 2004.08333333333, 
    2004.16666666667, 2004.16666666667, 2004.16666666667, 2004.16666666667, 
    2004.25, 2004.25, 2004.25, 2004.25, 2004.33333333333, 2004.33333333333, 
    2004.33333333333), class = "yearmon")), .Names = c("n", "ym"
), row.names = c(NA, 20L), class = "data.frame")

The error was:

> tst %>% group_by(ym) %>% summarize(ave=mean(n))
Error: column 'ym' has unsupported type : yearmon

Is there a way to make this work with both zoo and dplyr, or will I have to encode my year-month separately?

Solution

As the error says, dplyr does not support the yearmon class. Converting ym to a class that dplyr does support, such as numeric, makes the pipeline work:

library(dplyr)
tst %>% 
       group_by(ym = as.numeric(ym)) %>%
       summarise(ave = mean(n))
#        ym      ave
#1 2004.000 25.00000
#2 2004.083 21.80000
#3 2004.167 23.00000
#4 2004.250 23.25000
#5 2004.333 21.66667

Alternatively, as @G.Grothendieck mentioned in the comments, the group_by call can be replaced with group_by(ym = as.Date(ym)) or group_by(ym = format(ym, "%Y-%m")).
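The two alternatives from the comment can be sketched as follows. This is a minimal example and not part of the original answer: it rebuilds tst compactly with as.yearmon instead of pasting the dput output, and the names by_date and by_label are made up for illustration. It assumes the zoo and dplyr packages are installed.

```r
library(zoo)
library(dplyr)

# Rebuild the example data: 20 values spread over January to May 2004.
tst <- data.frame(
  n = c(23L, 24L, 26L, 27L, 26L, 23L, 19L, 19L, 22L, 22L,
        22L, 22L, 26L, 26L, 19L, 22L, 26L, 25L, 22L, 18L)
)
# Months 0..4 of 2004, repeated 4, 5, 4, 4 and 3 times (same as the dput data).
tst$ym <- as.yearmon(2004 + rep(0:4, times = c(4, 5, 4, 4, 3)) / 12)

# Alternative 1: group on a Date (the first day of each month).
by_date <- tst %>%
  group_by(ym = as.Date(ym)) %>%
  summarise(ave = mean(n))

# Alternative 2: group on a character label such as "2004-01".
by_label <- tst %>%
  group_by(ym = format(ym, "%Y-%m")) %>%
  summarise(ave = mean(n))

print(by_date)
print(by_label)
```

Both variants should give the same five averages as the as.numeric version. The Date form keeps the key usable for date arithmetic and plotting, while the character form is easier to read in printed output but sorts as text.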
