Grouping rows with Groupby and converting date & time of rows into start date-time and end date-time columns
Problem description
I have a dataset that looks like this:
Blast Hole  East Coordinate  North Coordinate  Collar   Theoritical Depth  Tag Detector ID  Date and Time          Detection_Location  Detection Date & Time
64          16745.42         107390.32         2634.45  15.95              385656531        23-08-2018 2:39:34 PM  CV23                2018-09-08 14:18:17
61          16773.48         107382.6          2634.68  16.18              385760755        23-08-2018 2:38:32 PM  CV23                2018-09-08 14:24:19
63          16755.07         107387.68         2634.58  16.08              385262370        23-08-2018 2:39:30 PM  CV23                2018-09-08 14:12:42
105         16764.83         107347.67         2634.74  16.24              385742468        23-08-2018 2:41:29 PM  CV22                2018-09-06 20:02:46
100         16752.74         107360.32         2634.33  15.83              385112050        23-08-2018 2:41:08 PM  CV22                2018-09-06 20:15:42
99          16743.1          107362.96         2634.36  15.86              385087366        23-08-2018 2:41:05 PM  CV22                2018-09-06 20:49:21
35          16747.75         107417.68         2635.9   17.4               385453358        23-08-2018 2:36:09 PM  CV22                2018-09-23 05:47:44
5           16757.27         107452.4          2636     17.5               385662254        23-08-2018 2:35:03 PM  CV22                2018-09-23 05:01:12
19          16770.89         107420.83         2634.81  16.31              385826979        23-08-2018 2:35:50 PM  CV22                2018-09-23 05:52:54
I want to group all rows that have 3 detections at one place (the Detection_Location column) within one hour. I used the following code to group the rows by hour and location:
df2 = (
    df1.groupby([pd.Grouper(key='Detection Date & Time', freq='H'),
                 df1.Detection_Location])
    .size()
    .reset_index(name='Tags')
)
This code gave me the number of tags per hour and location. Instead, I would like each row to have a start time (the first detection in that hour) and an end time (the last detection in that hour). This is the required output:
Detection Date & Time - Start  Detection Date & Time - End  Detection_Location  Tags
2018-09-06 20:02:46            2018-09-06 20:49:21          CV22                3
2018-09-08 14:12:42            2018-09-08 14:24:19          CV23                3
2018-09-23 05:01:12            2018-09-23 05:47:44          CV22                3
Can anyone suggest what else I should add to my groupby call to get this result?
Thanks
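Check whether this works for you. You can pass agg a list of all the aggregations you want to capture. Note that 'first' and 'last' follow row order, so they only return the start and end times when the data is already sorted by 'Detection Date & Time'; 'min' and 'max' do not depend on the order:
df2 = (
    df1.groupby([pd.Grouper(key='Detection Date & Time', freq='H'),
                 df1.Detection_Location])['Detection Date & Time']
    # earliest, latest, and number of detections per hour and location
    .agg(['min', 'max', 'size'])
    .reset_index()
    .rename(columns={"min": "Detection Date & Time - Start",
                     "max": "Detection Date & Time - End",
                     "size": "Tags"})
)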
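As a way to check the approach on the sample rows, here is a minimal, self-contained sketch. It rests on a few assumptions that are not in the original post: only the two columns needed for the grouping are rebuilt, the hourly bin is written as freq="60min" (an equivalent spelling chosen here because the hourly alias differs between pandas versions), 'min' and 'max' are used so that the unsorted sample does not matter, and the hour-bin index level is dropped so that the columns match the required output.

```python
import pandas as pd

col = "Detection Date & Time"

# Only the two columns needed for the grouping, copied from the sample rows.
df1 = pd.DataFrame({
    "Detection_Location": ["CV23"] * 3 + ["CV22"] * 6,
    col: pd.to_datetime([
        "2018-09-08 14:18:17", "2018-09-08 14:24:19", "2018-09-08 14:12:42",
        "2018-09-06 20:02:46", "2018-09-06 20:15:42", "2018-09-06 20:49:21",
        "2018-09-23 05:47:44", "2018-09-23 05:01:12", "2018-09-23 05:52:54",
    ]),
})

df2 = (
    # Assumption: "60min" stands in for the hourly alias used in the post.
    df1.groupby([pd.Grouper(key=col, freq="60min"), "Detection_Location"])[col]
    # Earliest, latest, and number of detections per (hour, location) group.
    .agg(["min", "max", "size"])
    # Drop the hour-bin level, then turn Detection_Location into a column.
    .reset_index(level=0, drop=True)
    .reset_index()
    .rename(columns={"min": f"{col} - Start",
                     "max": f"{col} - End",
                     "size": "Tags"})
)

# Reorder the columns to match the layout of the required output.
df2 = df2[[f"{col} - Start", f"{col} - End", "Detection_Location", "Tags"]]
print(df2.to_string(index=False))
```

On these sample rows, the end time computed for the 05:00 group is 2018-09-23 05:52:54, the latest of the three timestamps in that hour, whereas the required-output table in the question lists 05:47:44 for that row.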