dplyr - mutate_each - column-wise coercion to POSIXlt fails
Question
I recently came across dplyr and, as a newbie, like it very much. Hence, I am trying to convert some of my base-R code into dplyr code.
Working with air traffic control data, I am struggling to coerce timestamps: I parse them with lubridate and convert them with as.POSIXlt inside a mutate_each() call. I need the POSIXlt format because I have to work with local times (at different locations) later on. Reading in the data delivers a data frame of character columns. The following is a simplified example:
ICAO_ADEP <- c("DGAA","ZSPD","UAAA","RJTT","KJFK","WSSS")
MVT_TIME_UTC <- c("01-Jan-2013 04:02:24", NA,"01-Jan-2013 04:08:18", NA,"01-Jan-2013 04:17:11","01-Jan-2013 04:21:52")
flights <- data.frame(ICAO_ADEP, MVT_TIME_UTC)
I wrote the following function:
library(lubridate)

# Parse day-month-year timestamps, then convert the result to POSIXlt
make_POSIXlt <- function(vec, tz = "UTC") {
  vec <- parse_date_time(vec, orders = "dmy_hms", tz = tz)
  as.POSIXlt(vec, tz = tz)
}
It works on a single column:
flights$MVT_TIME_UTC <- make_POSIXlt(flights$MVT_TIME_UTC)
If I run the following dplyr code, the function fails:
flights$BLOCK_TIME_UTC <- mutate_each(flights, funs(make_POSIXlt(.)), MVT_TIME_UTC)
Error: wrong result size (9), expected 6 or 1
The issue seems to be linked to the as.POSIXlt call. If that line is commented out, the code works within mutate_each and coerces the timestamp to POSIXct.
Any ideas or help on what is wrong? My data has several timestamp columns that I would like to coerce with mutate_each (or any other suitable dplyr function).
Recommended answer
Revisiting my question about 4 years later, I realised that I forgot to mark it as answered. However, this also gives me the chance to document how this (relatively) simple type coercion can now be solved elegantly with dplyr and lubridate.
Key lessons learnt:
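- Never use POSIXlt in a data frame (or in its later sibling, the tibble, although you can now work with list columns). A POSIXlt value is internally a list of date-time components rather than one value per row, which is most likely where the unexpected result size of 9 in the error came from.
- Coerce date-timestamps with the helpful parser functions from the lubridate package.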
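The first lesson can be checked in base R. The sketch below uses an arbitrary example timestamp (an assumption for illustration only) to show that POSIXct is stored as a plain numeric vector, while POSIXlt is stored as a list of components. The exact number of components can differ between R versions, so no count is asserted here.

```r
# One instant, stored in the two base-R date-time classes
ct <- as.POSIXct("2013-01-01 04:02:24", tz = "UTC")
lt <- as.POSIXlt(ct)

typeof(ct)          # numeric storage: seconds since the epoch
typeof(lt)          # list storage: one element per date-time component
names(unclass(lt))  # the component names (sec, min, hour, ...)
```

Because a data frame column is expected to hold one value per row, the list layout of POSIXlt is a poor fit, whereas POSIXct fits naturally.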
For the example above:
ICAO_ADEP <- c("DGAA","ZSPD","UAAA","RJTT","KJFK","WSSS")
MVT_TIME_UTC <- c("01-Jan-2013 04:02:24", NA,"01-Jan-2013 04:08:18", NA,"01-Jan-2013 04:17:11","01-Jan-2013 04:21:52")
flights <- data.frame(ICAO_ADEP, MVT_TIME_UTC)
flights <- flights %>% mutate(MVT_TIME_UTC = lubridate::dmy_hms(MVT_TIME_UTC))
This coerces the timestamps in MVT_TIME_UTC to POSIXct. Check the lubridate documentation for other parsers and for how to handle local time zones.
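The question also mentions several timestamp columns and local times. Here is a minimal sketch of one way to handle both, assuming dplyr 1.0 or later (which provides across(); mutate_each() has since been deprecated). The shortened data frame, the values in BLOCK_TIME_UTC and the choice of "America/New_York" for KJFK are assumptions made for illustration.

```r
library(dplyr)
library(lubridate)

# Hypothetical data with two character timestamp columns
flights <- data.frame(
  ICAO_ADEP      = c("DGAA", "KJFK"),
  MVT_TIME_UTC   = c("01-Jan-2013 04:02:24", "01-Jan-2013 04:17:11"),
  BLOCK_TIME_UTC = c("01-Jan-2013 04:10:00", NA),
  stringsAsFactors = FALSE
)

# Parse every column whose name ends in "_TIME_UTC" to POSIXct
flights <- flights %>%
  mutate(across(ends_with("_TIME_UTC"), ~ dmy_hms(.x, tz = "UTC")))

# Show a UTC instant on the local clock of one location
local_jfk <- with_tz(flights$MVT_TIME_UTC[2], tzone = "America/New_York")
```

Note that a POSIXct column carries a single time zone attribute, so local times for airports in different zones would have to be derived per row or per group rather than stored in one shared column.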