直到满足条件的两个日期之间的天数差 [英] Difference of days between two dates until a condition is met
问题描述
在找到条件之前,我很难计算连续的天数.
下表给出了Gap done
是我用解决方案表
I have got trouble to make to compute the number of days in a row until a condition is found.
It is given in the following table were Gap done
is the messy table I obtained with the solution form there , and Expected gap
the output I want to obtain.
+--------+------------+---------------------+----------+----------------------------------------------------------------------------------------------+
| Player | Result | Date | Gap done | Expected Gap |
+--------+------------+---------------------+----------+----------------------------------------------------------------------------------------------+
| K2000 | Lose | 2015-11-13 13:42:00 | Nan | Nan/0 |
| K2000 | Lose | 2016-03-23 16:40:00 | 131.0 | 131.0 |
| K2000 | Lose | 2016-05-16 19:17:00 | 54.0 | 185.0 |
| K2000 | Win | 2016-06-09 19:36:00 | 54.0 | 239.0 #he always lose before |
| K2000 | Win | 2016-06-30 14:05:00 | 54.0 | 54.0 #because he won last time, it's 54 days btw this current date and the last time he won. |
| K2000 | Lose | 2016-07-29 16:20:00 | 29.0 | 29.0 |
| K2000 | Win | 2016-10-08 17:48:00 | 29.0 | 58.0 |
| Kssis | Lose | 2007-02-25 15:05:00 | Nan | Nan/0 |
| Kssis | Lose | 2007-04-25 16:07:00 | 59.0 | 59.0 |
| Kssis | Not ranked | 2007-06-01 16:54:00 | 37.0 | 96.0 |
| Kssis | Lose | 2007-09-09 14:33:00 | 99.0 | 195.0 |
| Kssis | Lose | 2008-04-06 16:27:00 | 210.0 | 405.0 |
+--------+------------+---------------------+----------+----------------------------------------------------------------------------------------------+
解决方案的问题有并不能真正计算日期.在此示例中,日期始终可能相隔1天.
The issue of the solution there is it does not really compute date. It has the chance that date in this example are always separate by 1 day.
确定我适应了
def sum_days_in_row_with_condition(g):
sorted_g = g.sort_values(by='date', ascending=True)
condition = sorted_g['Result'] == 'Win'
sorted_g['days-in-a-row'] = g.date.diff().dt.days.where(~condition).ffill()
return sorted_g
但是,正如我向您展示的那样,这很混乱.
But as I showed you, this is messy.
所以我考虑了一个解决方案,但是它需要全局变量(功能失效),这有点讲究.
So I thought about a solution, but it needs global variables (out of function), and that's a little fastidious.
任何人都可以以更简单的方式帮助解决此问题吗?
Can anyone help to solve this problematic in a simpler way ?
Pandas版本:0.23.4 Python版本:3.7.4
Pandas version: 0.23.4 Python version: 3.7.4
推荐答案
IIUC,您需要找到布尔掩码m1
,其中win
的前一行也是win
.在m1
中创建一个groupID s
以分隔组win
.将它们分成组并cumsum
IIUC, you need to find the boolean mask m1
where win
has previous row also win
. From m1
create a groupID s
to separate group win
. Split them into group and cumsum
m = df.Result.eq('Win')
m1 = m & m.shift()
s = m1.ne(m1.shift()).cumsum()
df['Expected Gap'] = df.groupby(['Player', s])['Gap done'].cumsum()
Out[808]:
Player Result Date Gap done Expected Gap
0 K2000 Lose 2015-11-13 13:42:00 NaN NaN
1 K2000 Lose 2016-03-23 16:40:00 131.0 131.0
2 K2000 Lose 2016-05-16 19:17:00 54.0 185.0
3 K2000 Win 2016-06-09 19:36:00 54.0 239.0
4 K2000 Win 2016-06-30 14:05:00 54.0 54.0
5 K2000 Lose 2016-07-29 16:20:00 29.0 29.0
6 K2000 Win 2016-10-08 17:48:00 29.0 58.0
7 Kssis Lose 2007-02-25 15:05:00 NaN NaN
8 Kssis Lose 2007-04-25 6:07:00 59.0 59.0
9 Kssis Not-ranked 2007-06-01 16:54:00 37.0 96.0
10 Kssis Lose 2007-09-09 14:33:00 99.0 195.0
11 Kssis Lose 2008-04-06 16:27:00 210.0 405.0
这篇关于直到满足条件的两个日期之间的天数差的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!