sas7bdat date format is stored as a number


Question

The problem posed by user @spore234 appears to have been solved: sas7bdat date format to R date format

I, too, loaded a .sas7bdat file and got numbers instead of dates. Unlike for @spore234, as.Date(..., origin = "1970-01-01") does not give the correct result, whereas as.POSIXct(..., origin = "1960-01-01") gives the right date but not the right time. This is the reverse of the findings from 5 years ago.

To avoid confusion, I opened a new question.

dir <- tempdir()

#The dates in SAS are 
# SUBMITTED_AT                RUN_DATE
# 06APR2021:16:17:02          11DEC2020:05:00:00
# 06APR2021:16:17:02          11DEC2020:05:00:00

wide <- sas7bdat::read.sas7bdat(file.path(dir, "sas7bdat_issue.sas7bdat"))
print(wide)
#>   SUBMITTED_AT   RUN_DATE
#> 1   1933345022 1923282000
#> 2   1933345022 1923282000

as.Date(wide$SUBMITTED_AT, origin = "1970-01-01")
#> [1] "5295288-11-18" "5295288-11-18"

# as.Date() gives the wrong year. Using as.POSIXct() with origin = '1960-01-01' gives the right day but not the right time.

as.POSIXct(wide$SUBMITTED_AT,origin='1960-01-01')
#> [1] "2021-04-06 12:17:02 EDT" "2021-04-06 12:17:02 EDT"

It's an easy fix, as we can format the result so that the time is not shown. However, could some digits get lost in the SAS-to-R conversion?

as.POSIXct(wide$RUN_DATE, format = "%Y-%m-%d", origin='1960-01-01')
#> [1] "2020-12-11 EST" "2020-12-11 EST"

xfun::session_info("sas7bdat")
#> R version 4.0.4 (2021-02-15)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 18363)
#> 
#> Locale:
#>   LC_COLLATE=English_United States.1252 
#>   LC_CTYPE=English_United States.1252   
#>   LC_MONETARY=English_United States.1252
#>   LC_NUMERIC=C                          
#>   LC_TIME=English_United States.1252    
#> 
#> Package version:
#>   sas7bdat_0.5

Created on 2021-05-11 by the reprex package (v1.0.0)

SAS (r) Proprietary Software 9.4 (TS1M4)

Huh? These findings are odd.

Recommended answer

I think the timezone offset is causing the issue. Try:

as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC")

EDT is 4 hours behind UTC, but both results represent exactly the same point in time, and this is reflected in how they are stored in R. The only difference is how the times are displayed:

as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="America/New_York")
#[1] "2021-04-06 12:17:02 EDT" "2021-04-06 12:17:02 EDT"
as.numeric(as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="America/New_York"))
#[1] 1617725822 1617725822

as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC")
#[1] "2021-04-06 16:17:02 UTC" "2021-04-06 16:17:02 UTC"
as.numeric(as.POSIXct(wide$SUBMITTED_AT, origin="1960-01-01", tz="UTC"))
#[1] 1617725822 1617725822
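
The conversion above can be wrapped in a small helper so that the origin and timezone are always stated explicitly. This is only a sketch: `sas_datetime_to_posixct` is a hypothetical name, not a function from the sas7bdat package, and it assumes the column holds SAS datetime values (seconds since 1960-01-01 00:00:00) that should be read as UTC.

```r
# Hypothetical helper (not part of the sas7bdat package).
# Assumes x holds SAS datetimes: seconds since 1960-01-01 00:00:00.
sas_datetime_to_posixct <- function(x, tz = "UTC") {
  # The origin is interpreted in UTC; tz only controls how the result prints.
  as.POSIXct(x, origin = "1960-01-01", tz = tz)
}

# The SUBMITTED_AT values from the question
submitted <- sas_datetime_to_posixct(c(1933345022, 1933345022))
submitted
#> [1] "2021-04-06 16:17:02 UTC" "2021-04-06 16:17:02 UTC"

# Drop the time of day, stating the timezone so the calendar day is unambiguous
submitted_day <- as.Date(submitted, tz = "UTC")

# The stored number of seconds since 1970-01-01 is the same in any timezone
as.numeric(submitted)
#> [1] 1617725822 1617725822
```

Passing `tz` explicitly keeps the printed time independent of the machine's locale. No digits are lost in the conversion, since only the display changes.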
