How to convert a very large dataset to xts? - as.xts fails on 1.5M rows

Question

I have the following data:

> dput(head(data))
structure(list(Gmt.time = c("01.06.2015 00:00", "01.06.2015 00:01", 
"01.06.2015 00:02", "01.06.2015 00:03", "01.06.2015 00:04", "01.06.2015 00:05"
), Open = c(0.88312, 0.88337, 0.88377, 0.88412, 0.88393, 0.8838
), High = c(0.88337, 0.88378, 0.88418, 0.88418, 0.88393, 0.88393
), Low = c(0.883, 0.88337, 0.88374, 0.88394, 0.88368, 0.88362
), Close = c(0.88337, 0.88375, 0.88412, 0.88394, 0.8838, 0.88393
), Volume = c(83.27, 100.14, 117.18, 52.53, 77.69, 91.63)), .Names = c("Gmt.time", 
"Open", "High", "Low", "Close", "Volume"), row.names = c(NA, 
6L), class = "data.frame")
> 

and there are no NA values:

any(is.na(head(data)))
[1] FALSE

If I run this on the first few rows, as in the data provided:

data_xts <- xts(head(data[,2:6]), as.POSIXct(head(data[,1]), format='%d.%m.%Y %H:%M'))

it works fine.

But if I run it on the full dataset

> nrow(data)
[1] 1581120

I get:

> data_xts <- xts(data[,2:6], as.POSIXct(data[,1], format='%d.%m.%Y %H:%M'))
Error in xts(data[, 2:6], as.POSIXct(data[, 1], format = "%d.%m.%Y %H:%M")) : 
  'order.by' cannot contain 'NA', 'NaN', or 'Inf'

Recommended answer

If your timestamps are in GMT, as the column name implies, then as.POSIXct(data[,1], format='%d.%m.%Y %H:%M') may be returning NA because no time zone was specified, so the local time zone is assumed by default. You may have a timestamp that does not exist in your local time zone, which would return NA. Try as.POSIXct(data[,1], format='%d.%m.%Y %H:%M', tz = "GMT") instead.
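
A minimal sketch of that suggestion. The six-row data frame below is a stand-in for the asker's `data`, built from the dput output above with only two of the price columns, so that the snippet runs on its own. The xts step is skipped if the package is not installed.

```r
# Stand-in for the question's data frame (first six rows, two price columns).
data <- data.frame(
  Gmt.time = c("01.06.2015 00:00", "01.06.2015 00:01", "01.06.2015 00:02",
               "01.06.2015 00:03", "01.06.2015 00:04", "01.06.2015 00:05"),
  Open  = c(0.88312, 0.88337, 0.88377, 0.88412, 0.88393, 0.8838),
  Close = c(0.88337, 0.88375, 0.88412, 0.88394, 0.8838, 0.88393),
  stringsAsFactors = FALSE
)

# Parse the timestamps explicitly as GMT rather than in the session's local zone.
idx <- as.POSIXct(data$Gmt.time, format = "%d.%m.%Y %H:%M", tz = "GMT")

# xts() rejects an index containing NA, so check before building the object.
stopifnot(!anyNA(idx))

if (requireNamespace("xts", quietly = TRUE)) {
  data_xts <- xts::xts(data[, -1], order.by = idx)
  print(head(data_xts))
}
```

On the full dataset the same two steps apply: parse data[,1] with tz = "GMT", confirm that the index has no NA, then pass it to xts().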

My guess is that the first record returning NA has a timestamp within an hour that is skipped by the daylight saving time change in your local time zone (i.e., a time that does not exist there), as described here.
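
One way to test that guess is to parse the same strings once in the local zone and once in GMT, then list the rows that fail only in the local parse. The sketch below uses hypothetical timestamps and assumes "Europe/Berlin" as the local zone, where 02:00 to 02:59 on 29.03.2015 was skipped. Whether a non-existent local time comes back as NA or as a shifted time depends on the platform and R version, so the result is printed rather than assumed.

```r
# Hypothetical timestamps around a spring-forward change in Europe/Berlin.
stamps <- c("29.03.2015 01:59", "29.03.2015 02:30", "29.03.2015 03:00")
fmt <- "%d.%m.%Y %H:%M"

local_idx <- as.POSIXct(stamps, format = fmt, tz = "Europe/Berlin")
gmt_idx   <- as.POSIXct(stamps, format = fmt, tz = "GMT")

# Rows that are NA in the local parse but valid in GMT point to a DST gap.
bad <- which(is.na(local_idx) & !is.na(gmt_idx))
print(stamps[bad])
```

On the real data, replacing `stamps` with data[,1] and the zone with the value of Sys.timezone() should show which rows, if any, triggered the error.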
