Convert character to date *quickly* in R


Question



Possible Duplicate:
Why is as.Date slow on a character vector?

I have a large data.frame (roughly 60 million observations) that I read from a database using RMySQL. The dates are brought in as characters (there doesn't seem to be a way to change this), so I use as.Date to convert them to dates. However, this takes an extremely long time with so many observations. Is there anything one can do to make this faster?

Solution

Simon Urbanek's fasttime library is very fast for a subset of parseable datetimes:

R> now <- Sys.time()
R> now
[1] "2012-10-15 10:07:28.981 CDT"
R> fasttime::fastPOSIXct(format(now))
[1] "2012-10-15 05:07:28.980 CDT"
R> as.Date(fasttime::fastPOSIXct(format(now)))
[1] "2012-10-15"
R> 

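However, it only parses ISO formats and assumes UTC as the time zone, which is why the parsed time in the example above is shifted by five hours relative to the local CDT time.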
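As a minimal sketch of how this might be applied to a whole column: the vector `x` below is made-up sample data, not the asker's table, and the base R fallback is only there so the snippet also runs where fasttime is not installed. The strings are parsed as UTC and the date is then taken in UTC as well, so the time zone assumption cannot shift a value across midnight.

```r
# Made-up sample data standing in for the character column read via RMySQL.
x <- rep(c("2012-10-15 10:07:28", "2012-10-15 23:59:59", "2013-01-01 00:00:00"),
         times = 1e4)

if (requireNamespace("fasttime", quietly = TRUE)) {
  # fastPOSIXct() treats the input as UTC; tz = "UTC" only sets how it prints.
  stamps <- fasttime::fastPOSIXct(x, tz = "UTC")
  # as.Date() on a POSIXct also defaults to UTC, so the calendar day is kept.
  dates <- as.Date(stamps)
} else {
  # Fallback without fasttime: give as.Date() the format explicitly.
  dates <- as.Date(x, format = "%Y-%m-%d")
}

print(head(dates, 3))
```

Because both steps use UTC, the resulting date is the calendar day written in the string, regardless of the session time zone.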

Edit after 3 1/2 years: Some commenters appear to think that the fasttime package is difficult to install. I beg to differ. Here is (once again) the use of install.r, which is just a simple wrapper using littler (and is also shipped with it as an example):

edd@max:~$ install.r fasttime
trying URL 'https://cran.rstudio.com/src/contrib/fasttime_1.0-1.tar.gz'
Content type 'application/x-gzip' length 2646 bytes
==================================================
downloaded 2646 bytes

* installing *source* package ‘fasttime’ ...
** package ‘fasttime’ successfully unpacked and MD5 sums checked
** libs
ccache gcc -I/usr/share/R/include -DNDEBUG      -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -g  -O3 -Wall -pipe -pedantic -std=gnu99  -c tparse.c -o tparse.o
ccache gcc -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o fasttime.so tparse.o -L/usr/lib/R/lib -lR
installing to /usr/local/lib/R/site-library/fasttime/libs
** R
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (fasttime)

The downloaded source packages are in
        ‘/tmp/downloaded_packages’
edd@max:~$ 

As you can see, the package has zero external dependencies and one source file, and it builds without the slightest hitch. We can also see that fasttime is now on CRAN, which was not the case when the answer was written. With that, Windows and OS X binaries now exist on its CRAN page, and installation will be as easy as it was for me even when you do not install from source.
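For readers who install from within an R session rather than with install.r, a minimal sketch might look like the following. `requireNamespace()` and `install.packages()` are standard R functions; the `have_fasttime` variable and the `interactive()` guard are illustrative choices and not part of the original answer.

```r
# Check whether fasttime is already installed, without attaching it.
have_fasttime <- requireNamespace("fasttime", quietly = TRUE)

# Only attempt an installation in an interactive session, so that
# batch scripts never trigger a download on their own.
if (!have_fasttime && interactive()) {
  install.packages("fasttime")
  have_fasttime <- requireNamespace("fasttime", quietly = TRUE)
}

print(have_fasttime)
```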
