R: strptime() and is.na() unexpected results


Problem description


I have a data frame with roughly 8 million rows and 3 columns. I used strptime() in the following manner:

df$date.time <- strptime(df$date.time, "%m/%d/%y %I:%M:%S %p")

This works fine for all but 1104 of the rows, which I checked using

df[is.na(df$date.time), ]

When I look at these "problem" data, the date.time entries seem to be formatted in the way I would expect. For example, here is an observation that comes up as a problem, but doesn't appear to be an NA:

id                date.time              outcome
observation543490 2012-03-11 02:14:01    C

What could possibly be going on here that is.na(df$date.time) returns a TRUE value for this row that has apparently been converted correctly?

Here's a reproducible example (if you're in CST):

is.na(strptime("03/11/12 2:14:01 AM", "%m/%d/%y %I:%M:%S %p", "CST6CDT"))
#[1] TRUE

Solution

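The times that return NA most likely do not exist in the timezone you are using, because of daylight saving time. In US zones that observe it, clocks jumped from 02:00 straight to 03:00 on 2012-03-11, so 02:14:01 never occurred in CST6CDT. The row still prints normally because strptime() returns a POSIXlt object that stores the parsed fields as given, while is.na() reports TRUE because that clock time cannot be converted to an actual instant.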

Check with the data source to determine the timezone the data were recorded in, then set the tz argument to that value in your call to strptime.
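
As a minimal sketch of that advice, the snippet below parses the timestamp from the question with an explicit tz argument. Using "UTC" here is purely an assumption for illustration; substitute whatever zone the data source confirms. The variable names x, fmt, utc, cdt, and parsed are hypothetical, and parsing %p assumes an English-style locale.

```r
# The example timestamp and format string from the question.
x   <- "03/11/12 2:14:01 AM"
fmt <- "%m/%d/%y %I:%M:%S %p"

# Parsed in a zone with no daylight saving time, the clock time exists.
# "UTC" is an assumed zone for illustration, not the known source zone.
utc <- strptime(x, fmt, tz = "UTC")
is.na(utc)

# Parsed in a zone that skipped 02:00-03:00 on that date, the clock time
# may not map to any instant; the exact result is platform-dependent.
cdt <- strptime(x, fmt, tz = "America/Chicago")
is.na(cdt)

# as.POSIXct() stores one number per timestamp, which suits a data frame
# column better than the list-based POSIXlt that strptime() returns.
parsed <- as.POSIXct(utc)
```

Converting to POSIXct before assigning into the data frame is a design choice rather than a requirement; it tends to keep a large column compact and avoids surprises from the POSIXlt list structure.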
