Sort Datetime data by day, but from 4PM to 4PM

Question

I have Tweets about companies posted at various times of day, and I want to group them by day. I have already done this. However, I want each day to run not from 00:00 until 23:59, but from 16:00 until 15:59 (because of the NYSE opening hours).

Tweets (the Negative, Neutral and Positive columns hold the sentiment counts):

 Company,Datetime_UTC,Negative,Neutral,Positive,Volume
 AXP,2013-06-01 16:00:00+00:00,0,2,0,2
 AXP,2013-06-01 17:00:00+00:00,0,2,0,2
 AXP,2013-06-02 05:00:00+00:00,0,1,0,1
 AXP,2013-06-02 16:00:00+00:00,0,2,0,2

My code:

 Tweets$Datetime_UTC <- as.Date(Tweets$Datetime)
 Sent <- aggregate(list(Tweets$Negative, Tweets$Neutral, Tweets$Positive), by=list(Tweets$Company, Tweets$Datetime_UTC), sum)
 colnames(Sent) <- c("Company", "Date", "Negative", "Neutral", "Positive")
 Sent <- Sent[order(Sent$Company),]

Output of that code:

 Company,Date,Negative,Neutral,Positive
 AXP,2013-06-01,0,4,0
 AXP,2013-06-02,0,3,0

How I'd want it to be (considering that a day should start at 16:00):

 Company,Date,Negative,Neutral,Positive
 AXP,2013-06-02,0,5,0
 AXP,2013-06-03,0,2,0  

As you can see, my code almost works. I just want to group by a different time window.

How can I do this? One idea would be to add 8 hours to every Datetime_UTC value, which would turn 16:00 into 00:00 of the next day. After that, I could just use my existing code. Would that be possible?

Thanks in advance! :-)

Recommended answer

Effectively, what you're doing is redefining a day to start at 16:00 instead of 00:00. One option is to convert to epoch time (seconds since 1970-01-01 00:00:00+00:00) and simply slide your data forward by eight hours.
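To see why an eight-hour shift moves the day boundary to 16:00, here is a minimal sketch. The two timestamps are made up for illustration (they are not from the question's data), and the variable names `stamps`, `shifted` and `trading_day` are hypothetical.

```r
# Hypothetical timestamps on either side of the 16:00 UTC boundary
stamps <- as.POSIXct(c("2013-06-01 15:59:00", "2013-06-01 16:00:00"),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# POSIXct stores seconds since the epoch, so adding 8 * 3600 shifts by 8 hours
shifted <- stamps + 8 * 3600

# Truncate to calendar dates in UTC
trading_day <- as.Date(shifted, tz = "UTC")
print(trading_day)
# 15:59 stays on 2013-06-01; 16:00 rolls over to 2013-06-02
```

Anything stamped before 16:00 keeps its calendar date, while anything at or after 16:00 is labelled with the following date.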

You can convert to epoch seconds, add 8 hours' worth of seconds (28800), and convert back to the Date class, all in one line. Then you would just aggregate as you had been.

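 # POSIXct holds epoch seconds, so + 28800 shifts each timestamp forward 8 hours
 Tweets$Datetime_UTC <- as.Date(as.POSIXct(Tweets$Datetime_UTC, format = "%Y-%m-%d %H:%M:%S", tz = "UTC") + 28800)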

Replace your first line of code with that and it should do the trick.
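As a check, the whole approach can be sketched end to end on the four sample rows from the question. This is one possible way to write it, not necessarily how the asker's script is laid out: the rows are read from an inline string, the shifted date is stored in a new `Date` column (an assumed name), and the formula interface of `aggregate` is used in place of the list-based call.

```r
# Sample rows copied from the question
Tweets <- read.csv(text = "Company,Datetime_UTC,Negative,Neutral,Positive,Volume
AXP,2013-06-01 16:00:00+00:00,0,2,0,2
AXP,2013-06-01 17:00:00+00:00,0,2,0,2
AXP,2013-06-02 05:00:00+00:00,0,1,0,1
AXP,2013-06-02 16:00:00+00:00,0,2,0,2", stringsAsFactors = FALSE)

# Parse as UTC; the trailing "+00:00" lies beyond the format and is ignored
parsed <- as.POSIXct(Tweets$Datetime_UTC, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Shift forward 8 hours so that 16:00 becomes 00:00 of the next day
Tweets$Date <- as.Date(parsed + 8 * 3600, tz = "UTC")

# Sum the sentiment counts per company and shifted date
Sent <- aggregate(cbind(Negative, Neutral, Positive) ~ Company + Date,
                  data = Tweets, FUN = sum)
Sent <- Sent[order(Sent$Company, Sent$Date), ]
print(Sent)
# AXP 2013-06-02: Neutral 5; AXP 2013-06-03: Neutral 2
```

The totals match the desired output in the question: the first three tweets fall into the window labelled 2013-06-02, and the 16:00 tweet on 2013-06-02 opens the window labelled 2013-06-03.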
