Is there a function to find the difference between datetimes?
Question
I have multiple dataframes which can have the same timestamps (also +-1 second), and the timestamps include milliseconds. When they are all together in the new dataframe, I want to filter out the rows where they are more than 1 second different from each other.
Is there a function similar to dftogether['unique'] = np.ediff1d(dftogether['DateTime']) that works with timestamps?
My current solution works, but I am looking for a proper way to do it.
Let's say I have 3 dataframes, df1, df2 and df3. For each dataframe I do this:
df1['DateTime'] = df1['DateTime'].apply(lambda x: x.strftime('%Y%d%m%H%M%S'))
df1['DateTime'] = df1['DateTime'].astype(np.int64)
Which turns my DateTime into int so I can do this:
dftogether= pd.concat(z, sort=True)
dftogether= dftogether.sort_values('DateTime')
dftogether['unique'] = np.ediff1d(dftogether['DateTime'], to_begin=20181211150613411) <1
dftogether= dftogether[dftogether.unique == False]
Then I convert the int back to datetime:
dftogether['DateTime'] = dftogether['DateTime'].apply(lambda x: pd.to_datetime(str(x), format='%Y%d%m%H%M%S'))
I couldn't figure out how to create sample data for the timestamps, so I will just copy-paste parts of the dataframes.
df1
737 2018-12-18 12:37:19.717
738 2018-12-18 12:37:21.936
739 2018-12-18 12:37:22.841
740 2018-12-18 12:37:23.144
877 2018-12-18 12:40:53.268
878 2018-12-18 12:40:56.597
879 2018-12-18 12:40:56.899
880 2018-12-18 12:40:57.300
968 2018-12-18 12:43:31.411
969 2018-12-18 12:43:36.150
970 2018-12-18 12:43:36.452
df2
691 2018-12-18 12:35:23.612
692 2018-12-18 12:35:25.627
788 2018-12-18 12:38:33.248
789 2018-12-18 12:38:33.553
790 2018-12-18 12:38:34.759
866 2018-12-18 12:40:29.487
867 2018-12-18 12:40:31.199
868 2018-12-18 12:40:32.206
df3
699 2018-12-18 12:35:42.452
701 2018-12-18 12:35:45.081
727 2018-12-18 12:36:47.466
730 2018-12-18 12:36:51.796
741 2018-12-18 12:37:23.448
881 2018-12-18 12:40:57.603
910 2018-12-18 12:42:02.904
971 2018-12-18 12:43:37.361
I want dftogether to look like this, but with timestamps instead of ints:
Unique DateTime
737 False 20181812123719
738 False 20181812123721
739 False 20181812123722
741 False 20181812123723
742 True 20181812123723
740 True 20181812123723
785 False 20181812123830
786 False 20181812123831
787 False 20181812123832
787 True 20181812123832
788 False 20181812123833
so I can drop the ones where Unique == True:
785 False 2018-12-18 12:38:30
786 False 2018-12-18 12:38:31
787 False 2018-12-18 12:38:32
788 False 2018-12-18 12:38:33
790 False 2018-12-18 12:38:34
812 False 2018-12-18 12:39:10
813 False 2018-12-18 12:39:11
Something else: where can I voice my opinion on the new Stack Overflow "Ask a question" page? IMO it is really awful: it keeps scrolling up, entering or copy-pasting code is really confusing now, and all the "For Example" text is really distracting. It took me more than 30 minutes to write this question.
Recommended answer
I did this, where your initial columns are a and b. Is this what you needed?
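from datetime import timedelta
import pandas as pd

# Stack the frames and sort by the timestamp column 'b'
df = pd.concat([df1, df2, df3])
df = df.sort_values('b')
df['s'] = df['b'].shift()             # previous row's timestamp
df['d'] = df['b'] - df['s']           # gap to the previous row
df['f'] = df['d'] < timedelta(0, 1)   # True if the gap is under 1 second
print(df)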
Result:
a b s d f
0 691 2018-12-18 12:35:23.612 NaT NaT False
1 692 2018-12-18 12:35:25.627 2018-12-18 12:35:23.612 00:00:02.015000 False
0 699 2018-12-18 12:35:42.452 2018-12-18 12:35:25.627 00:00:16.825000 False
1 701 2018-12-18 12:35:45.081 2018-12-18 12:35:42.452 00:00:02.629000 False
2 727 2018-12-18 12:36:47.466 2018-12-18 12:35:45.081 00:01:02.385000 False
3 730 2018-12-18 12:36:51.796 2018-12-18 12:36:47.466 00:00:04.330000 False
0 737 2018-12-18 12:37:19.717 2018-12-18 12:36:51.796 00:00:27.921000 False
1 738 2018-12-18 12:37:21.936 2018-12-18 12:37:19.717 00:00:02.219000 False
2 739 2018-12-18 12:37:22.841 2018-12-18 12:37:21.936 00:00:00.905000 True
3 740 2018-12-18 12:37:23.144 2018-12-18 12:37:22.841 00:00:00.303000 True
4 741 2018-12-18 12:37:23.448 2018-12-18 12:37:23.144 00:00:00.304000 True
2 788 2018-12-18 12:38:33.248 2018-12-18 12:37:23.448 00:01:09.800000 False
3 789 2018-12-18 12:38:33.553 2018-12-18 12:38:33.248 00:00:00.305000 True
4 790 2018-12-18 12:38:34.759 2018-12-18 12:38:33.553 00:00:01.206000 False
5 866 2018-12-18 12:40:29.487 2018-12-18 12:38:34.759 00:01:54.728000 False
6 867 2018-12-18 12:40:31.199 2018-12-18 12:40:29.487 00:00:01.712000 False
7 868 2018-12-18 12:40:32.206 2018-12-18 12:40:31.199 00:00:01.007000 False
4 877 2018-12-18 12:40:53.268 2018-12-18 12:40:32.206 00:00:21.062000 False
5 878 2018-12-18 12:40:56.597 2018-12-18 12:40:53.268 00:00:03.329000 False
6 879 2018-12-18 12:40:56.899 2018-12-18 12:40:56.597 00:00:00.302000 True
7 880 2018-12-18 12:40:57.300 2018-12-18 12:40:56.899 00:00:00.401000 True
5 881 2018-12-18 12:40:57.603 2018-12-18 12:40:57.300 00:00:00.303000 True
6 910 2018-12-18 12:42:02.904 2018-12-18 12:40:57.603 00:01:05.301000 False
8 968 2018-12-18 12:43:31.411 2018-12-18 12:42:02.904 00:01:28.507000 False
9 969 2018-12-18 12:43:36.150 2018-12-18 12:43:31.411 00:00:04.739000 False
10 970 2018-12-18 12:43:36.452 2018-12-18 12:43:36.150 00:00:00.302000 True
7 971 2018-12-18 12:43:37.361 2018-12-18 12:43:36.452 00:00:00.909000 True
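The shift-and-subtract steps above can also be written with Series.diff(), which does the same thing in one call and keeps the column as real timestamps. The following is only a minimal sketch: the two small frames are hypothetical sample data built from a few of the timestamps in the question, and the names gap and filtered are made up for illustration.

```python
import pandas as pd

# Hypothetical sample frames; the column name 'DateTime' follows the question.
df1 = pd.DataFrame({"DateTime": pd.to_datetime([
    "2018-12-18 12:37:22.841",
    "2018-12-18 12:37:23.144",
    "2018-12-18 12:40:53.268",
])})
df2 = pd.DataFrame({"DateTime": pd.to_datetime([
    "2018-12-18 12:37:23.448",
    "2018-12-18 12:38:33.248",
])})

dftogether = pd.concat([df1, df2], ignore_index=True).sort_values("DateTime")

# Series.diff() on a datetime64 column yields timedeltas; the first row is NaT.
gap = dftogether["DateTime"].diff()

# Comparing NaT with a Timedelta gives False, so the first row is never flagged.
dftogether["unique"] = gap < pd.Timedelta(seconds=1)

# Keep only rows that are at least one second after the previous row.
filtered = dftogether[~dftogether["unique"]]
print(filtered)
```

Note that each row is compared with the row directly before it in the sorted frame, so in a run of closely spaced timestamps every row after the first is flagged. Whether that is the desired behaviour depends on how the question's "+-1 second" rule is meant to apply.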