Cut time spells into calendar months in pandas

Problem description

I have data on spells (hospital stays), each with a start and end date, and I want to count the number of days spent in hospital per calendar month. Of course, this number can be zero for months that no spell covers. But I cannot simply attribute the full length of each spell to its starting month, because longer spells run over into the following month (or further).

Basically, it would suffice for me if I could cut spells at turn-of-month datetimes, getting from the data in the first example to the data in the second:

id                    start                     end
 1      2011-01-01 10:00:00     2011-01-08 16:03:00
 2      2011-01-28 03:45:00     2011-02-04 15:22:00
 3      2011-03-02 11:04:00     2011-03-05 05:24:00

id                    start                     end     month      stay
 1      2011-01-01 10:00:00     2011-01-08 16:03:00   2011-01         7
 2      2011-01-28 03:45:00     2011-01-31 23:59:59   2011-01         4
 2      2011-02-01 00:00:00     2011-02-04 15:22:00   2011-02         4
 3      2011-03-02 11:04:00     2011-03-05 05:24:00   2011-03         3

I read up on the Time Series / Date functionality of pandas, but I do not see a straightforward solution to this. How can one accomplish the slicing?

Solution

It's simpler than you think: just subtract the dates. The result is a time span. See the related question "Add column with number of days between dates in DataFrame pandas".

You even get to do this for the entire frame at once: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.subtract.html
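As a minimal sketch of the subtraction described above, assuming the question's sample data is loaded into a DataFrame with datetime columns named `start` and `end` (the names `spells`, `duration` and `days` are only illustrative):

```python
import pandas as pd

# Sample spells from the question, parsed as datetimes.
spells = pd.DataFrame({
    "id": [1, 2, 3],
    "start": pd.to_datetime([
        "2011-01-01 10:00:00", "2011-01-28 03:45:00", "2011-03-02 11:04:00",
    ]),
    "end": pd.to_datetime([
        "2011-01-08 16:03:00", "2011-02-04 15:22:00", "2011-03-05 05:24:00",
    ]),
})

# Subtracting two datetime columns gives a timedelta column for the whole frame.
spells["duration"] = spells["end"] - spells["start"]

# .dt.days keeps only the whole-day component (the fractional part is dropped).
spells["days"] = spells["duration"].dt.days
```

Note that this gives the length of each whole spell; it does not yet split a spell that crosses a month boundary.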


Update, now that I understand the problem better. Add a new column: take the spell's end date; if the start date is in a different month, then set this new date's day to 01 and the time to 00:00.

This is the cut DateTime you can use to compute the portion of the stay attributable to each month. cut - start is the first month; end - cut is the second.
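The single cut described above covers a spell that crosses one month boundary. One possible way to generalize it to spells spanning any number of months is sketched below. This is not the answerer's code: the function name `split_spells_by_month` is hypothetical, segments are half-open (a piece ends at 00:00:00 on the first of the next month rather than at 23:59:59), and `stay` is assumed to be the segment length rounded to the nearest day, which is what the numbers in the question's second table suggest.

```python
import pandas as pd

def split_spells_by_month(spells: pd.DataFrame) -> pd.DataFrame:
    """Cut each spell at month starts; one output row per (spell, month)."""
    rows = []
    for spell in spells.itertuples(index=False):
        # Every calendar month touched by the spell, as monthly Periods.
        for month in pd.period_range(start=spell.start, end=spell.end, freq="M"):
            seg_start = max(spell.start, month.start_time)
            # The start of the next month is the cut point.
            seg_end = min(spell.end, (month + 1).start_time)
            if seg_start < seg_end:  # skip zero-length pieces
                rows.append({"id": spell.id, "start": seg_start,
                             "end": seg_end, "month": str(month)})
    out = pd.DataFrame(rows)
    # Assumption: "stay" is the segment length rounded to whole days.
    length_in_days = (out["end"] - out["start"]) / pd.Timedelta(days=1)
    out["stay"] = length_in_days.round().astype(int)
    return out

# Sample spells from the question.
spells = pd.DataFrame({
    "id": [1, 2, 3],
    "start": pd.to_datetime([
        "2011-01-01 10:00:00", "2011-01-28 03:45:00", "2011-03-02 11:04:00",
    ]),
    "end": pd.to_datetime([
        "2011-01-08 16:03:00", "2011-02-04 15:22:00", "2011-03-05 05:24:00",
    ]),
})

result = split_spells_by_month(spells)
```

From there, `result.groupby("month")["stay"].sum()` would give a per-month total; months with no spell at all would still need to be added with a count of zero, for example by reindexing over the full range of months.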
