Data difference in `as.POSIXct` with Excel
Question
My actual data looks like this:
8/8/2013 15:10
7/26/2013 10:30
7/11/2013 14:20
3/28/2013 16:15
3/18/2013 15:50
When I read this from the Excel file, R reads it as:
41494.63
41481.44
41466.60
41361.68
41351.66
So I used as.POSIXct(as.numeric(x[1:5]) * 86400, origin = "1899-12-30", tz = "GMT")
and got:
2013-08-08 15:07:12 GMT
2013-07-26 10:33:36 GMT
2013-07-11 14:24:00 GMT
2013-03-28 16:19:12 GMT
2013-03-18 15:50:24 GMT
Why is there a difference in the times? How can I overcome it?
Recommended answer
The problem is that either R or Excel is rounding the number to two decimals. For example, when you convert the cell containing 8/8/2013 15:10 to text formatting (in Excel on Mac OS X), you get the number 41494.63194.
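To see how much a two-decimal serial number can move the time, here is a minimal sketch in base R. It only redoes the arithmetic for the 15:10 example; the variable names are illustrative and not part of the original question.

```r
# Fraction of a day represented by 15:10 (Excel stores the time as a day fraction)
frac <- (15 * 60 + 10) / (24 * 60)   # 0.6319444...

# What survives when the fraction is rounded to 2 or 5 decimals
frac_2dp <- round(frac, 2)   # 0.63
frac_5dp <- round(frac, 5)   # 0.63194

# Error introduced by each rounding, in seconds (86400 seconds per day)
err_2dp_sec <- (frac_2dp - frac) * 86400   # about -168 s, i.e. 15:07:12
err_5dp_sec <- (frac_5dp - frac) * 86400   # about -0.384 s, i.e. 15:09:59.6

print(c(two_decimals = err_2dp_sec, five_decimals = err_5dp_sec))
```

One second is 1/86400 of a day, roughly 0.0000116, so a serial number needs more than five decimals before the seconds are reliable.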
When you use:
as.POSIXct(41494.63194*86400, origin="1899-12-30",tz="GMT")
it gives you:
[1] "2013-08-08 15:09:59 GMT"
This is 1 second off from the original time (which also indicates that 41494.63194 is itself rounded to five decimals).
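If the serial number is available with five decimals, one possible workaround is to snap the converted value to the nearest whole minute before building the date-time. This is only a sketch, and it assumes the original timestamps were entered to the minute, as they are in the sample data; it cannot repair values that were already cut to two decimals.

```r
# Serial number as shown by Excel with five decimals
serial <- 41494.63194

# Convert days to seconds, then round to the nearest whole minute.
# Assumption: the true timestamps have no seconds component.
secs <- round(serial * 86400 / 60) * 60

ts <- as.POSIXct(secs, origin = "1899-12-30", tz = "GMT")
print(format(ts, "%Y-%m-%d %H:%M:%S"))
```

For this value the result is 2013-08-08 15:10:00, matching the cell in the spreadsheet.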
Probably the best solution is to export your Excel file to a .csv or a tab-separated .txt file and then read it into R. This at least gives me the correct dates:
> df
datum
1 8/8/2013 15:10
2 7/26/2013 10:30
3 7/11/2013 14:20
4 3/28/2013 16:15
5 3/18/2013 15:50
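After the export, the column is still text, so it needs to be parsed into date-times. The following is a minimal sketch of that step: the inline csv_text stands in for the exported file, and the GMT time zone is an assumption carried over from the question, not something the export determines.

```r
# Stand-in for the exported .csv file (in practice, pass the file path to read.csv)
csv_text <- "datum
8/8/2013 15:10
7/26/2013 10:30
7/11/2013 14:20
3/28/2013 16:15
3/18/2013 15:50"

df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Parse month/day/year hour:minute strings; tz = "GMT" is an assumption
df$datum <- as.POSIXct(df$datum, format = "%m/%d/%Y %H:%M", tz = "GMT")

print(df)
```

Because the times are read as text and never pass through a rounded serial number, no minutes or seconds are lost.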