Cut time spells into calendar months in pandas
Problem description
I have data on spells (hospital stays), each with a start and end date, but I want to count the number of days spent in hospital for calendar months. Of course, this number can be zero for months not appearing in a spell. But I cannot just attribute the length of each spell to the starting month, as longer spells run over to the following month (or more).
Basically, it would suffice for me if I could cut spells at turn-of-month datetimes, getting from the data in the first example to the data in the second:
id  start                end
1   2011-01-01 10:00:00  2011-01-08 16:03:00
2   2011-01-28 03:45:00  2011-02-04 15:22:00
3   2011-03-02 11:04:00  2011-03-05 05:24:00
id  start                end                  month    stay
1   2011-01-01 10:00:00  2011-01-08 16:03:00  2011-01  7
2   2011-01-28 03:45:00  2011-01-31 23:59:59  2011-01  4
2   2011-02-01 00:00:00  2011-02-04 15:22:00  2011-02  4
3   2011-03-02 11:04:00  2011-03-05 05:24:00  2011-03  3
I read up on the Time Series / Date functionality of pandas, but I do not see a straightforward solution to this. How can one accomplish the slicing?
It's simpler than you think: just subtract the dates. The result is a time span (a Timedelta in pandas). See the related question "Add column with number of days between dates in DataFrame pandas".
You even get to do this for the entire frame at once: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.subtract.html
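As a minimal sketch of that subtraction, assuming the question's sample data is loaded with `start` and `end` as datetime columns (the column names `length` and `days` are only illustrative):

```python
import pandas as pd

# Sample spells from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "start": pd.to_datetime([
        "2011-01-01 10:00:00", "2011-01-28 03:45:00", "2011-03-02 11:04:00",
    ]),
    "end": pd.to_datetime([
        "2011-01-08 16:03:00", "2011-02-04 15:22:00", "2011-03-05 05:24:00",
    ]),
})

# Subtracting two datetime columns gives a Timedelta column.
df["length"] = df["end"] - df["start"]

# Dividing by a one-day Timedelta turns it into fractional days (float).
df["days"] = df["length"] / pd.Timedelta(days=1)

print(df[["id", "length", "days"]])
```

This gives the total length of each spell; it does not yet split a spell across months.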
Update, now that I understand the problem better. Add a new column holding a cut datetime: take the spell's end date and, if the start date is in a different month, set the day to 01 and the time to 00:00.
This is the cut datetime you can use to compute the portion of the stay attributable to each month: cut - start is the part in the first month, and end - cut is the part in the second. A spell that spans three or more months would need one cut per month boundary.
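The cut described above could be sketched roughly as follows. This is only one way to do it, under some assumptions: no spell crosses more than one month boundary, the first piece ends exactly at midnight on the 1st (rather than at 23:59:59 as in the question's table), and `stay` is the piece length rounded to the nearest whole day, which happens to reproduce the question's numbers. The names `crosses`, `pieces`, `first` and `second` are illustrative.

```python
import pandas as pd

spells = pd.DataFrame({
    "id": [1, 2, 3],
    "start": pd.to_datetime([
        "2011-01-01 10:00:00", "2011-01-28 03:45:00", "2011-03-02 11:04:00",
    ]),
    "end": pd.to_datetime([
        "2011-01-08 16:03:00", "2011-02-04 15:22:00", "2011-03-05 05:24:00",
    ]),
})

start_month = spells["start"].dt.to_period("M")
end_month = spells["end"].dt.to_period("M")
crosses = start_month != end_month  # True where the spell runs into another month

# Cut point: midnight on the 1st of the end month when the spell crosses a
# month boundary, otherwise the end itself (so the second piece is empty).
spells["cut"] = end_month.dt.start_time.where(crosses, spells["end"])

# First piece: start -> cut, attributed to the starting month.
first = pd.DataFrame({
    "id": spells["id"],
    "start": spells["start"],
    "end": spells["cut"],
    "month": start_month,
})

# Second piece: cut -> end, only for spells that actually cross a boundary.
second = pd.DataFrame({
    "id": spells["id"],
    "start": spells["cut"],
    "end": spells["end"],
    "month": end_month,
})[crosses]

pieces = (
    pd.concat([first, second])
    .sort_values(["id", "start"])
    .reset_index(drop=True)
)

# Assumption: stay = piece length rounded to the nearest whole day.
pieces["stay"] = (
    (pieces["end"] - pieces["start"]) / pd.Timedelta(days=1)
).round().astype(int)

print(pieces)
```

Summing `stay` per `month` (for example with `pieces.groupby("month")["stay"].sum()`) would then give days in hospital per calendar month; months with no spell simply do not appear in the result.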