R: xts timestamp differs from real data timestamp by 1 millisecond

Problem description

I have the following data.

tt <- structure(list(Timestamp = c("2018-03-01 09:51:59.969", "2018-03-01 09:51:59.969", 
"2018-03-01 09:51:59.970", "2018-03-01 09:51:59.971", "2018-03-01 09:51:59.987", 
"2018-03-01 09:51:59.988"), Mid_Px = c(30755.5, 30755, 30755.5, 
30756, 30756.5, 30756.5)), .Names = c("Timestamp", "Mid_Px"), class = "data.frame", row.names = 85774:85779)

It looks like this:

                    Timestamp  Mid_Px
85774 2018-03-01 09:51:59.969 30755.5
85775 2018-03-01 09:51:59.969 30755.0
85776 2018-03-01 09:51:59.970 30755.5
85777 2018-03-01 09:51:59.971 30756.0
85778 2018-03-01 09:51:59.987 30756.5
85779 2018-03-01 09:51:59.988 30756.5

When I try to create an xts object out of it by using the code below, things start to go bad.

library(xts)
tt_ts <- strptime(tt[,1],"%Y-%m-%d %H:%M:%OS")
tt_ts
[1] "2018-03-01 09:51:59.969 CST" "2018-03-01 09:51:59.969 CST" "2018-03-01 09:51:59.970 CST" "2018-03-01 09:51:59.971 CST" "2018-03-01 09:51:59.987 CST"
[6] "2018-03-01 09:51:59.988 CST"
xts(x=tt[,c(-1)], order.by=tt_ts)
                           [,1]
2018-03-01 09:51:59.969 30755.5
2018-03-01 09:51:59.969 30755.0
2018-03-01 09:51:59.970 30755.5
2018-03-01 09:51:59.970 30756.0
2018-03-01 09:51:59.986 30756.5
2018-03-01 09:51:59.987 30756.5

Notice that the milliseconds are incorrect in rows 4, 5 and 6.

What have I done wrong here? How can I fix it to display the correct timestamps?

Recommended answer

This is similar to the question "R issue with rounding milliseconds". One simple solution is to add 0.5 ms, as suggested there:

tt_ts <- strptime(tt[,1],"%Y-%m-%d %H:%M:%OS") + 0.0005
xts::xts(x=tt[,c(-1)], order.by=tt_ts)
#                            [,1]
# 2018-03-01 09:51:59.969 30755.5
# 2018-03-01 09:51:59.969 30755.0
# 2018-03-01 09:51:59.970 30755.5
# 2018-03-01 09:51:59.971 30756.0
# 2018-03-01 09:51:59.987 30756.5
# 2018-03-01 09:51:59.988 30756.5

We can see this from a simple example:

st <- strptime("2018-03-01 09:51:59.971", "%Y-%m-%d %H:%M:%OS")
format(st, "%Y-%m-%d %H:%M:%OS3")
#> [1] "2018-03-01 09:51:59.971"
pt <- as.POSIXct(st)
format(pt, "%Y-%m-%d %H:%M:%OS3")
#> [1] "2018-03-01 09:51:59.970"

After conversion to POSIXct the milliseconds are wrong. Increasing the output precision, we see that the floating point number used to represent the time is just below the required value, and R truncates the number instead of rounding it:

format(pt, "%Y-%m-%d %H:%M:%OS6")
#> [1] "2018-03-01 09:51:59.970999"
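
The same effect can be checked on the raw number, without going through format(). The following is a minimal sketch in base R; it assumes tz = "UTC" (the question's data is in CST) so that it behaves the same on any machine. The fractional part does not depend on the time zone offset, because the offset is a whole number of seconds.

```r
# Parse the timestamp directly to POSIXct (tz = "UTC" is an assumption
# made for reproducibility, it is not taken from the question).
pt <- as.POSIXct("2018-03-01 09:51:59.971",
                 format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Fractional part of the seconds since the epoch, as actually stored
frac <- as.numeric(pt) %% 1
sprintf("%.10f", frac)

# The stored fraction is slightly below the intended 0.971 ...
frac < 0.971
# ... but only by a tiny representation error
0.971 - frac < 1e-6
```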

Shifting by one half of the required precision fixes this.

format(pt + 0.0005, "%Y-%m-%d %H:%M:%OS3")
#> [1] "2018-03-01 09:51:59.971"
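
The shift can be wrapped in a small helper that derives the offset from the number of displayed digits. This is only a sketch: parse_shifted is a hypothetical name, and the tz = "UTC" default is an assumption chosen for reproducibility. Note that the helper changes the stored values by half a unit, so it is best applied consistently to every timestamp in a series.

```r
# Hypothetical helper: parse timestamps and shift them by half a unit of
# the display precision, so that truncation shows the intended digits.
parse_shifted <- function(x, digits = 3, tz = "UTC") {
  p <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%OS", tz = tz)
  p + 0.5 * 10^(-digits)
}

stamps <- c("2018-03-01 09:51:59.969",
            "2018-03-01 09:51:59.971",
            "2018-03-01 09:51:59.987")
shifted <- parse_shifted(stamps)

# Whole milliseconds within the second, obtained by truncation
ms <- floor((as.numeric(shifted) %% 1) * 1000)
ms
```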

Generally, if x is a number with 3 decimal digits, any number within the open range (x - 0.0005, x + 0.0005) would be rounded to x. On truncation, that would still work for those within [x, x + 0.0005). But those within (x - 0.0005, x) would be represented by x - 0.001 as you observed. If we shift the relevant number by 0.0005 before truncation, we are speaking about the range (x, x + 0.001). All these numbers will be truncated to x as wanted.
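
The argument can be illustrated with plain numbers. In this sketch, trunc_ms is an illustrative helper (not part of base R or xts) that truncates a seconds value to whole milliseconds, and the offset 4e-8 is an arbitrary stand-in for a small representation error.

```r
# Truncate a seconds value to whole milliseconds (illustrative helper)
trunc_ms <- function(v) floor(v * 1000)

x <- 59.971
just_below <- x - 4e-8   # small error below the intended value
just_above <- x + 4e-8   # small error above the intended value

trunc_ms(just_below)            # 59970: one millisecond too low
trunc_ms(just_above)            # 59971
trunc_ms(just_below + 0.0005)   # 59971 after the half-millisecond shift
trunc_ms(just_above + 0.0005)   # 59971
```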

I am excluding the points x ± 0.0005 since there are different rules for rounding them and the actual floating point number representing the time point will be a lot closer to the desired value than this.

Concerning the question in the comments about taking differences: it should not matter whether you add half a millisecond or not, as long as you add it to both points. Here is an example with a time point that needs adjustment on its own:

st1 <- strptime("2018-03-01 09:51:59.971", "%Y-%m-%d %H:%M:%OS")
format(st1, "%Y-%m-%d %H:%M:%OS3")                              
#> [1] "2018-03-01 09:51:59.971"
pt1 <- as.POSIXct(st1)                                          
format(pt1, "%Y-%m-%d %H:%M:%OS3")                              
#> [1] "2018-03-01 09:51:59.970"
format(pt1 + 0.0005, "%Y-%m-%d %H:%M:%OS3")                     
#> [1] "2018-03-01 09:51:59.971"

And a time point that does not need adjustment:

st2 <- strptime("2018-03-01 09:51:59.969", "%Y-%m-%d %H:%M:%OS")
format(st2, "%Y-%m-%d %H:%M:%OS3")                              
#> [1] "2018-03-01 09:51:59.969"
pt2 <- as.POSIXct(st2)                                          
format(pt2, "%Y-%m-%d %H:%M:%OS3")                              
#> [1] "2018-03-01 09:51:59.969"
format(pt2 + 0.0005, "%Y-%m-%d %H:%M:%OS3")                     
#> [1] "2018-03-01 09:51:59.969"

The difference is the same regardless of the adjustment:

# units must be passed by name: the third positional argument of difftime() is tz
difftime(pt1, pt2, units = "secs")
#> Time difference of 0.001999855 secs
difftime(pt1 + 0.0005, pt2 + 0.0005, units = "secs")
#> Time difference of 0.001999855 secs
