Converting a data frame to a monthly time series

Question

I have a data frame of monthly data for 100 years (1200 data points), with the months in the columns and the years in the rows. I want to convert it into a monthly time series. I have tried several approaches, but none of them produces the correct temporal structure.

The problem is that R treats the data frame as 100 observations (the years) of 12 variables (the months). Here is reproducible code for my latest attempt:

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200),digits=2),nrow=100,ncol=12))
rownames(dummy.df) <- seq(from=1901, to=2000)
colnames(dummy.df) <- c("jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec")
dummy.df.ts <- ts(as.vector(as.matrix(dummy.df)), start=c(1901,1), end=c(2000,12), frequency=12)

In the dummy.df.ts object, the rows and columns are switched: instead of sequential observations, all the Januaries, then all the Februaries, and so on are stacked one after the other. How can I get the correct temporal structure?

An example of my data: these are monthly temperature values for 1901 to 1905.

fr.monthly.temp.sample  

     JAN FEB MAR  APR  MAY  JUN  JUL  AUG  SEP  OCT NOV DEC  
1901 2.7 0.4 4.7 10.0 13.0 16.9 19.2 18.3 15.7 10.6 4.9 3.5  
1902 4.1 3.2 7.5 10.3 10.0 15.1 18.2 17.4 15.0 10.2 6.3 3.5  
1903 3.8 5.9 7.6  7.1 12.9 14.9 17.6 17.3 15.5 12.1 6.9 2.7  
1904 3.0 4.6 5.5 10.3 13.6 16.3 20.2 18.5 13.9 11.2 5.4 4.8  
1905 1.7 4.0 7.4  9.3 11.9 16.5 20.0 17.6 14.7  8.4 5.5 3.8  

And using this ts() call:

fr.monthly.temp.sample.ts <- ts(as.vector(as.matrix(fr.monthly.temp.sample)),
                                start=c(1901,1), end=c(1905,12), frequency=12)

This is the output I get for the time series object:

fr.monthly.temp.sample.ts  

      Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec  
1901  2.7  4.1  3.8  3.0  1.7  0.4  3.2  5.9  4.6  4.0  4.7  7.5  
1902  7.6  5.5  7.4 10.0 10.3  7.1 10.3  9.3 13.0 10.0 12.9 13.6  
1903 11.9 16.9 15.1 14.9 16.3 16.5 19.2 18.2 17.6 20.2 20.0 18.3  
1904 17.4 17.3 18.5 17.6 15.7 15.0 15.5 13.9 14.7 10.6 10.2 12.1  
1905 11.2  8.4  4.9  6.3  6.9  5.4  5.5  3.5  3.5  2.7  4.8  3.8  

Note the changed temporal structure: the values from the columns are now in the rows.

Thanks.

Recommended answers

Solution 1

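You could transpose the matrix with t() before vectorizing it. as.vector() reads a matrix column by column, so after transposing, each column holds one year and the values come out in calendar order: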

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200), digits = 2),
                                 nrow = 100, ncol = 12))
rownames(dummy.df) <- seq(1901, 2000)
colnames(dummy.df) <- month.abb
dummy.df.ts <- ts(as.vector(t(as.matrix(dummy.df))), 
                  start=c(1901,1), end=c(2000,12), frequency=12)
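
To make the effect of t() concrete, here is a small sketch that rebuilds the five-year temperature sample from the question and compares the two vectorizations. The helper names temps, wrong and right are made up for this illustration, and the matrix is typed in by hand from the table in the question.

```r
# Sample data from the question: years in rows, months in columns.
temps <- matrix(
  c(2.7, 0.4, 4.7, 10.0, 13.0, 16.9, 19.2, 18.3, 15.7, 10.6, 4.9, 3.5,
    4.1, 3.2, 7.5, 10.3, 10.0, 15.1, 18.2, 17.4, 15.0, 10.2, 6.3, 3.5,
    3.8, 5.9, 7.6,  7.1, 12.9, 14.9, 17.6, 17.3, 15.5, 12.1, 6.9, 2.7,
    3.0, 4.6, 5.5, 10.3, 13.6, 16.3, 20.2, 18.5, 13.9, 11.2, 5.4, 4.8,
    1.7, 4.0, 7.4,  9.3, 11.9, 16.5, 20.0, 17.6, 14.7,  8.4, 5.5, 3.8),
  nrow = 5, byrow = TRUE,
  dimnames = list(1901:1905, toupper(month.abb))
)
fr.monthly.temp.sample <- as.data.frame(temps)

# as.vector() walks down each column of a matrix, so without t() the
# vector starts with the five January values (1901 to 1905).
wrong <- as.vector(as.matrix(fr.monthly.temp.sample))

# After t(), each column holds one year, so the walk is Jan..Dec of 1901,
# then Jan..Dec of 1902, and so on.
right <- as.vector(t(as.matrix(fr.monthly.temp.sample)))

# Only start and frequency are given; the end follows from the data length.
fr.monthly.temp.sample.ts <- ts(right, start = c(1901, 1), frequency = 12)
```

Printing fr.monthly.temp.sample.ts should then show one year per row, matching the original table, and window(fr.monthly.temp.sample.ts, start = c(1902, 1), end = c(1902, 12)) extracts a single year for spot checks.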

Solution 2

You could melt the data, order by date, then apply the ts() function.

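Here is the data setup. In an English locale you could use month.abb to save some code, but that is not reliable in other locales, because the %b format used below to parse the month names follows the current locale.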

set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200),digits=2),nrow=100,ncol=12))
months <- format(seq.Date(as.Date("2013-01-01"), as.Date("2013-12-01"), 
                          by = "month"), format = "%b")
colnames(dummy.df) <- months
dummy.df$Year <- seq(1901, 2000) # set as variable, not as rownames 

Melt the data so that you have a data frame with 1200 rows, each representing one observation:

library("reshape2")
dummy.df <- melt(dummy.df, id.vars = "Year")

Order the observations by date:

dummy.df$Date <- as.Date(paste(dummy.df$Year, dummy.df$variable, "01", sep = "-"),
                         format = ("%Y-%b-%d"))
dummy.df <- dummy.df[order(dummy.df$Date), ]

Then you can apply a similar ts() call, and the resulting ts object shows the desired order:

dummy.df.ts <- ts(dummy.df$value, start=c(1901,1), end=c(2000,12), frequency=12)
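
If you would rather not depend on reshape2 or on locale-specific month names, the same reshape-then-sort idea can be sketched in base R. This is an illustrative variant, not part of the original answer: it swaps melt() for stack() and sorts on a numeric month index instead of a parsed date. The names long and Month are hypothetical.

```r
set.seed(12)
dummy.df <- as.data.frame(matrix(round(rnorm(1200), digits = 2),
                                 nrow = 100, ncol = 12))
colnames(dummy.df) <- month.abb
dummy.df$Year <- seq(1901, 2000)

# stack() puts the 12 month columns on top of each other and returns a
# data frame with columns `values` and `ind` (the source column name).
long <- stack(dummy.df[month.abb])

# Each stacked column contributes 100 rows in year order, so the year
# vector is simply repeated once per month column.
long$Year <- rep(dummy.df$Year, times = 12)

# Numeric month index 1..12; no date parsing, so no locale dependence.
long$Month <- match(as.character(long$ind), month.abb)

# Sort chronologically, then build the monthly series.
long <- long[order(long$Year, long$Month), ]
dummy.df.ts <- ts(long$values, start = c(1901, 1), frequency = 12)
```

Because the sort key is numeric, this variant behaves the same regardless of the session locale. A quick sanity check is to compare start(dummy.df.ts), end(dummy.df.ts) and frequency(dummy.df.ts) against the expected 1901, 2000 and 12.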
