How to most efficiently convert a character string of "01 Jan 2014" to POSIXct, i.e. "2014-01-01" (yyyy-mm-dd)


Question

I already have a partial answer to this problem in an earlier question, which I understand as far as it is explained: How to most efficiently restructure a character string for fasttime in data.table

However, the task has been extended and now needs to handle a variation of the original formatting.

I have a large dataset with a column of dates, of class character, in the form:

01 Jan 2014

or:

dd MMM yyyy

I want to restructure this to feed into fastPOSIXct, which only accepts character input in POSIXct order:

yyyy-mm-dd

The question linked above notes that an efficient approach would be to use a regex and then supply the output to fasttime. Do I need to extend this to include a method that understands month abbreviations, converts them to numbers, and then rearranges the string? How would I do this? I know that month.abb exists as a built-in constant. Should I be using it, or is there a smarter way?
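The month.abb idea raised in the question can be sketched in base R as follows. This is only an illustration, not the approach from the linked question: dmy_to_iso is a hypothetical helper name, and it assumes fixed-width "dd Mon yyyy" input with English month abbreviations that match month.abb exactly. The call to fastPOSIXct is guarded so the sketch still runs without the fasttime package installed.

```r
# Hypothetical helper: restructure "dd Mon yyyy" strings into "yyyy-mm-dd".
# Assumes fixed-width input and English abbreviations as found in month.abb.
dmy_to_iso <- function(x) {
  day  <- substr(x, 1, 2)
  mon  <- match(substr(x, 4, 6), month.abb)  # "Jan" -> 1, ..., "Dec" -> 12
  year <- substr(x, 8, 11)
  sprintf("%s-%02d-%s", year, mon, day)
}

dates <- c("01 Jan 2014", "15 Mar 2015", "31 Dec 2013")
iso <- dmy_to_iso(dates)
iso
# [1] "2014-01-01" "2015-03-15" "2013-12-31"

# Optionally hand the restructured strings to fasttime, if it is available.
if (requireNamespace("fasttime", quietly = TRUE)) {
  print(fasttime::fastPOSIXct(iso, tz = "UTC"))
}
```

Using substr and match avoids regex entirely, which tends to be cheap on long vectors, but it breaks as soon as the input is not fixed-width or uses a different locale's month names.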

Recommended answer

How about using lubridate:

library(lubridate)
x <- "01 Jan 2014"
x
# [1] "01 Jan 2014"
dmy(x)
# [1] "2014-01-01 UTC"

Of course, lubridate functions accept a tz argument too. For the complete list of accepted time zone names, see OlsonNames().
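As a small sketch of the tz argument, the following parses the same string in a named time zone. The zone "America/New_York" is just an example chosen here; any name returned by OlsonNames() on your system should work the same way.

```r
library(lubridate)

x <- "01 Jan 2014"

# Check that the example zone name is known on this system.
"America/New_York" %in% OlsonNames()

# With tz supplied, dmy() returns a POSIXct at midnight in that zone.
res <- dmy(x, tz = "America/New_York")
res
attr(res, "tzone")
```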

I decided to update this answer with some empirical data, using the microbenchmark package and lubridate's option to use fasttime.

library(microbenchmark)
microbenchmark(dmy(x), times = 10000)
Unit: milliseconds
   expr      min      lq     mean   median      uq     max neval
 dmy(x) 1.992639 2.02567 2.142212 2.041514 2.07153 39.1384 10000

options(lubridate.fasttime = TRUE)

microbenchmark(dmy(x), times = 10000)
Unit: milliseconds
   expr      min      lq     mean   median       uq      max neval
 dmy(x) 1.993326 2.02488 2.136748 2.039467 2.065326 163.2008 10000
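
The timings above are for a single string, whereas the question concerns a large column. A rough sketch of how one might time a longer vector is shown below. The synthetic input, the vector length of 100,000, and the hypothetical helper dmy_to_iso (a base R restructuring via month.abb) are all assumptions made for illustration, and no timings are claimed here since they depend on the machine and package versions. Note also that the two expressions return different types (a parsed date versus a character vector), so this is not a like-for-like comparison.

```r
library(lubridate)
library(microbenchmark)

# Synthetic input (an assumption): 100,000 strings in "dd Mon yyyy" form.
# month.abb is used instead of format(..., "%b") to stay locale-independent.
set.seed(1)
days  <- as.Date("2014-01-01") + sample(0:364, 1e5, replace = TRUE)
x_big <- paste(format(days, "%d"),
               month.abb[as.integer(format(days, "%m"))],
               format(days, "%Y"))

# Hypothetical helper: fixed-width restructuring to "yyyy-mm-dd".
dmy_to_iso <- function(x) {
  sprintf("%s-%02d-%s",
          substr(x, 8, 11),
          match(substr(x, 4, 6), month.abb),
          substr(x, 1, 2))
}

bench <- microbenchmark(
  lubridate  = dmy(x_big),
  base_match = dmy_to_iso(x_big),
  times = 10
)
bench
```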
