Count overlapping time frames in a pandas dataframe, grouped by person

Views: 47 | Posted: 2021/4/24 20:52:13 | Tags: python pandas count pandas-groupby overlap

Problem description

I'm using the top solution from a previous post to count, for each row, how many other rows have start and end times that overlap with it. However, I need these overlaps to be counted within groups, not across the whole dataframe.

The data I'm working with has the start and end times of conversations and the name of the person involved:

id  start_time              end_time             name
1   2021-02-10 10:37:35     2021-02-10 12:16:22  Bob
2   2021-02-10 11:09:39     2021-02-10 13:06:25  Bob
3   2021-02-10 12:10:33     2021-02-10 17:06:26  Bob
4   2021-02-10 15:05:08     2021-02-10 21:07:05  Sally 
5   2021-02-10 21:07:26     2021-02-10 21:26:37  Sally

This is the solution from the previous post:

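# ends[i, j] is True when row j starts before row i ends
ends = df['start_time'].values < df['end_time'].values[:, None]
# starts[i, j] is True when row j starts after row i starts
starts = df['start_time'].values > df['start_time'].values[:, None]
# sum(0) totals each column of the combined mask
df['overlap'] = (ends & starts).sum(0)
df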

But this records an overlap between conversations 3 and 4, which belong to different people, whereas I only want overlaps counted among conversations 1-3 or among conversations 4-5.

What I get now:

id  start_time              end_time             name   overlap
1   2021-02-10 10:37:35     2021-02-10 12:16:22  Bob    2
2   2021-02-10 11:09:39     2021-02-10 13:06:25  Bob    1
3   2021-02-10 12:10:33     2021-02-10 17:06:26  Bob    1
4   2021-02-10 15:05:08     2021-02-10 21:07:05  Sally  1 
5   2021-02-10 21:07:26     2021-02-10 21:26:37  Sally  0

What I want to get:

id  start_time              end_time             name   overlap
1   2021-02-10 10:37:35     2021-02-10 12:16:22  Bob    2
2   2021-02-10 11:09:39     2021-02-10 13:06:25  Bob    1
3   2021-02-10 12:10:33     2021-02-10 17:06:26  Bob    0
4   2021-02-10 15:05:08     2021-02-10 21:07:05  Sally  1 
5   2021-02-10 21:07:26     2021-02-10 21:26:37  Sally  0

Recommended answer

I think this might give you what you need.

Add an extra & condition that also requires the names to match:

ends = df['start_time'].values < df['end_time'].values[:, None]
starts = df['start_time'].values > df['start_time'].values[:, None]
same_group = (df['name'].values == df['name'].values[:, None])

# Note: sum across axis=1 (one count per row), not axis=0
df['overlap'] = (ends & starts & same_group).sum(1)

df
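
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the approach above. It rebuilds the sample rows from the question and applies the same three masks. Using `to_numpy()` instead of `.values` and the intermediate variable names (`start`, `end`, `name`, `mask`) are choices made for this example, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "start_time": pd.to_datetime([
        "2021-02-10 10:37:35", "2021-02-10 11:09:39", "2021-02-10 12:10:33",
        "2021-02-10 15:05:08", "2021-02-10 21:07:26",
    ]),
    "end_time": pd.to_datetime([
        "2021-02-10 12:16:22", "2021-02-10 13:06:25", "2021-02-10 17:06:26",
        "2021-02-10 21:07:05", "2021-02-10 21:26:37",
    ]),
    "name": ["Bob", "Bob", "Bob", "Sally", "Sally"],
})

# Plain NumPy arrays so that [:, None] broadcasting works (example's choice).
start = df["start_time"].to_numpy()
end = df["end_time"].to_numpy()
name = df["name"].to_numpy()

# mask[i, j] is True when row j starts strictly inside row i's interval
# and both rows belong to the same person.
starts_before_end = start < end[:, None]
starts_after_start = start > start[:, None]
same_group = name == name[:, None]
mask = starts_before_end & starts_after_start & same_group

# Summing along axis=1 counts, for each row i, the later-starting
# conversations of the same person that begin before row i ends.
df["overlap"] = mask.sum(axis=1)
print(df[["id", "name", "overlap"]])
```

Two things to keep in mind. With the sample rows exactly as typed in the question, conversation 5 starts 21 seconds after conversation 4 ends, so the strict comparison gives row 4 a count of 0. Also, the masks are N x N boolean arrays, so memory use grows quadratically with the number of rows.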
