How to change a time zone in a data frame?

Question

I am reading and parsing my data using:

str <- readLines("Messages.csv", n=-1, skipNul=TRUE)
matches <- str_match(str, pattern = "\\s*([0-9]{2}/[0-9]{2}/[0-9]{4}),\\s*([0-9]{2}:[0-9]{2}:[0-9]{2}),\\s*(Me|Them),\\s*(\\+[0-9]{11,12}),\\s*((?s).*)")
df <- data.frame(matches[, -1], stringsAsFactors=F)
colnames(df) <- c("date","time","sender","phone number","msg")


# Format the date and create a row with the number of characters of the messages
df <- df %>%
mutate(posix.date=parse_date_time(paste0(date,time),"%d%m%y%H%M%S"),tz="Europe/London") %>%           
 mutate(nb.char = nchar(msg)) %>%
 select(posix.date, sender, msg, nb.char) %>%
 arrange(as.numeric(posix.date))

I can change the senders' names using:

# Change the senders' names
df <- df %>%
  mutate(sender = replace(sender, sender == "Me", "Mr. Awesome")) 

But I want to change the time zone of the data to tz="America/Los_Angeles".

I have tried the following two approaches, both without success:

attributes(df)$tz<-"America/Los_Angeles"

This runs without error, but nothing seems to change.

And this:

df <- df %>%
mutate(date = replace(date, format(date, tz="America/Los_Angeles",usetz=TRUE)))

which gives the error: "Error in eval(expr, envir, enclos) : argument "values" is missing, with no default"

Perhaps I am not specifying the original time zone correctly, but I really have no idea how to check whether it went through.

Thanks!

Recommended answer

First, what you can change is the time zone of a POSIXct variable. It is not meaningful to "change the time zone in a data.frame", so setting a "tz" attribute on a data.frame does nothing.

[ Note: it is meaningful, however, to change the time zone of an xts object. See this post. ]
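As a rough illustration of the point above, and of one way to check which zone a column actually carries, the sketch below builds a one-row data frame (made-up data, not the asker's file) and inspects the "tzone" attribute of its POSIXct column. It uses base R only.

```r
# Build a small data frame with one POSIXct column (illustrative data only).
df <- data.frame(
  posix.date = as.POSIXct("2015-01-01 12:00:00", tz = "Europe/London")
)

# Setting an attribute on the data frame itself does not touch the column.
attributes(df)$tz <- "America/Los_Angeles"

# The time zone lives on the column, in its "tzone" attribute.
col_zone <- attr(df$posix.date, "tzone")
print(col_zone)

# Printing with usetz = TRUE shows the zone abbreviation next to the time.
print(format(df$posix.date, usetz = TRUE))
```

The same check can be run on the asker's own column with attr(df$posix.date, "tzone"). Note that in the question's code, tz="Europe/London" sits outside the call to parse_date_time(), so it is passed to mutate() rather than to the parser; inspecting the attribute would show which zone was actually recorded.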

I gather that your timestamps are in GMT and you want to convert them to the equivalent times in PST. If that is what you intend, then this should work:

df$posix.date <- as.POSIXct(as.integer(df$posix.date),
                            origin="1970-01-01", 
                            tz="America/Los_Angeles")

For example:

x <- as.POSIXct("2015-01-01 12:00:00", tz="Europe/London")
x
# [1] "2015-01-01 12:00:00 GMT"
as.POSIXct(as.integer(x),origin="1970-01-01",tz="America/Los_Angeles")
# [1] "2015-01-01 04:00:00 PST"

The issue here is that as.POSIXct(...) works differently depending on the class of the object passed to it. If you pass a character or integer, the time zone is set according to tz=.... If you pass an object that is already POSIXct, the tz=... argument is ignored. So here we convert x to integer first, so that the tz=... argument is respected.

Really convoluted. If there's an easier way I'd love to hear about it.
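One possibly simpler route, offered here as a sketch rather than as part of the original answer, is to set the "tzone" attribute of the POSIXct vector directly. This changes only how the instant is displayed, not the instant itself. The variable names x and y are illustrative.

```r
# The same example instant as above, recorded in London time.
x <- as.POSIXct("2015-01-01 12:00:00", tz = "Europe/London")

# Copy it and change only the display time zone.
y <- x
attr(y, "tzone") <- "America/Los_Angeles"

print(format(x, usetz = TRUE))  # "2015-01-01 12:00:00 GMT"
print(format(y, usetz = TRUE))  # "2015-01-01 04:00:00 PST"

# The underlying number of seconds since 1970-01-01 UTC is unchanged.
same_instant <- as.numeric(x) == as.numeric(y)
print(same_instant)
```

For a data frame column this would be attr(df$posix.date, "tzone") <- "America/Los_Angeles". lubridate, which the question already uses for parse_date_time(), also provides with_tz(x, tzone = "America/Los_Angeles") for the same kind of conversion.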
