sas7bdat date format is stored as a number
Problem description
The problem posed by the user @spore234 appears to be solved: sas7bdat date format to R date format
I, too, loaded a .sas7bdat file and got numbers instead of dates. Unlike for @spore234, using `as.Date(..., origin = "1970-01-01")` is not correct, but using `as.POSIXct(..., origin = "1960-01-01")` gives the right date, though not the right time. This is the reverse of the findings from 5 years ago.
To avoid confusion, I opened a new issue.
dir <- tempdir()
#The dates in SAS are
# SUBMITTED_AT RUN_DATE
# 06APR2021:16:17:02 11DEC2020:05:00:00
# 06APR2021:16:17:02 11DEC2020:05:00:00
wide <- sas7bdat::read.sas7bdat(file.path(dir, "sas7bdat_issue.sas7bdat"))
print(wide)
#> SUBMITTED_AT RUN_DATE
#> 1 1933345022 1923282000
#> 2 1933345022 1923282000
as.Date(wide$SUBMITTED_AT, origin = "1970-01-01")
#> [1] "5295288-11-18" "5295288-11-18"
# This gives the wrong year. Using `as.POSIXct()` with no modification gives the right day but not the right time.
as.POSIXct(wide$SUBMITTED_AT,origin='1960-01-01')
#> [1] "2021-04-06 12:17:02 EDT" "2021-04-06 12:17:02 EDT"
It's an easy fix, as we can format the value so that the time is not shown. However, could some numbers get lost in the SAS-to-R conversion?
as.POSIXct(wide$RUN_DATE, format = "%Y-%m-%d", origin='1960-01-01')
#> [1] "2020-12-11 EST" "2020-12-11 EST"
xfun::session_info("sas7bdat")
#> R version 4.0.4 (2021-02-15)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#>
#> Locale:
#> LC_COLLATE=English_United States.1252
#> LC_CTYPE=English_United States.1252
#> LC_MONETARY=English_United States.1252
#> LC_NUMERIC=C
#> LC_TIME=English_United States.1252
#>
#> Package version:
#> sas7bdat_0.5
Created on 2021-05-11 by the reprex package (v1.0.0)
SAS (r) Proprietary Software 9.4 (TS1M4)
Hmm? These findings are odd.
Recommended answer
I think the timezone offset is causing the issue. Try
as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC")
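As a minimal sketch of that suggestion, the conversion can be wrapped in a small helper. The function name `sas_datetime_to_posixct` is hypothetical and not part of the sas7bdat package; the sketch assumes the column holds SAS datetime values, which count seconds since 1960-01-01 00:00:00 and carry no timezone of their own, so reading them as UTC reproduces the clock time shown in SAS.

```r
# Hypothetical helper (not part of sas7bdat): interpret SAS datetime
# values as seconds since 1960-01-01 and label the result with `tz`.
sas_datetime_to_posixct <- function(x, tz = "UTC") {
  as.POSIXct(x, origin = "1960-01-01", tz = tz)
}

# Values taken from the question's SUBMITTED_AT column.
submitted <- sas_datetime_to_posixct(c(1933345022, 1933345022))

# Format explicitly so the result does not depend on print defaults.
format(submitted, "%Y-%m-%d %H:%M:%S")
#> [1] "2021-04-06 16:17:02" "2021-04-06 16:17:02"
```

Defaulting `tz` to "UTC" is a design choice for this sketch; pass another zone only if the times are meant to be displayed in that zone.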
EDT is 4 hours behind UTC, but both results are the exact same point in time, which is reflected in how they are stored in R. The only difference is how the times are displayed:
as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="America/New_York")
#[1] "2021-04-06 12:17:02 EDT" "2021-04-06 12:17:02 EDT"
as.numeric(as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="America/New_York"))
#[1] 1617725822 1617725822
as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC")
#[1] "2021-04-06 16:17:02 UTC" "2021-04-06 16:17:02 UTC"
as.numeric(as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC"))
#[1] 1617725822 1617725822
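The arithmetic behind these numbers can be checked directly, which also speaks to the worry about values getting lost. The sketch below is illustrative only and uses a value copied from the question; the variable names are assumptions. It shows that the SAS number and the R number differ by exactly the distance between the two epochs, and that the timezone argument changes only the display.

```r
sas_seconds <- 1933345022  # SUBMITTED_AT as read from the .sas7bdat file

# Seconds between the SAS epoch (1960-01-01) and the Unix epoch (1970-01-01).
epoch_offset <- as.numeric(as.POSIXct("1970-01-01", tz = "UTC")) -
  as.numeric(as.POSIXct("1960-01-01", tz = "UTC"))

# Subtracting the offset gives the Unix timestamp that R stores.
unix_seconds <- sas_seconds - epoch_offset

utc <- as.POSIXct(sas_seconds, origin = "1960-01-01", tz = "UTC")
ny  <- as.POSIXct(sas_seconds, origin = "1960-01-01", tz = "America/New_York")

# Both objects hold the same underlying number; nothing is lost.
as.numeric(utc) == as.numeric(ny)
as.numeric(utc) == unix_seconds

# The display zone can be chosen at formatting time without touching the data.
format(ny, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Calendar date only, taken in UTC so it matches the SAS display.
as.Date(utc, tz = "UTC")
```

Because the stored value is the same in both cases, choosing `tz = "UTC"` at conversion time or at formatting time is a matter of convenience.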