Python - Group Dates by Month
Problem description
Here's a quick problem that I, at first, dismissed as easy. An hour in, and I'm not so sure!
So, I have a list of Python datetime objects, and I want to graph them. The x-values are the year and month, and the y-values are the number of datetime objects in the list that fall in that month.
Perhaps an example will demonstrate this better (dd/mm/yyyy):
[28/02/2018, 01/03/2018, 16/03/2018, 17/05/2018]
-> ([02/2018, 03/2018, 04/2018, 05/2018], [1, 2, 0, 1])
My first attempt simply grouped by month and year, along the lines of:
import itertools
# groupby only merges adjacent items, so dates must already be sorted
group = itertools.groupby(dates, lambda date: date.strftime("%b/%Y"))
graph = zip(*[(k, len(list(v))) for k, v in group])  # format the data for graphing
As you've probably noticed though, this will group only by dates that are already present in the list. In my example above, the fact that none of the dates occurred in April would have been overlooked.
Next, I tried finding the starting and ending dates, and looping over the months between them:
import datetime
data = [[], []]
for year in range(min_date.year, max_date.year):
    for month in range(min_date.month, max_date.month):
        k = datetime.datetime(year=year, month=month, day=1).strftime("%b/%Y")
        v = sum([1 for date in dates if date.strftime("%b/%Y") == k])
        data[0].append(k)
        data[1].append(v)
Of course, this only works if min_date.month is smaller than max_date.month, which is not necessarily the case if the dates span multiple years. Also, it's pretty ugly.
Is there an elegant way of doing this?
Thanks in advance
EDIT: To be clear, the dates are datetime objects, not strings. They look like strings here for the sake of being readable.
Recommended answer
I suggest using pandas:
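import pandas as pd
dates = ['28/02/2018', '01/03/2018', '16/03/2018', '17/05/2018']
# parse the dd/mm/yyyy strings into datetimes
s = pd.to_datetime(pd.Series(dates), format='%d/%m/%Y')
# index each date by its monthly period ('M' is the monthly period alias)
s.index = s.dt.to_period('M')
# count the dates in each month
s = s.groupby(level=0).size()
# add the months with no dates, filled with 0
s = s.reindex(pd.period_range(s.index.min(), s.index.max(), freq='M'), fill_value=0)
print(s)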
2018-02 1
2018-03 2
2018-04 0
2018-05 1
Freq: M, dtype: int64
s.plot.bar()
Explanation:
- First create a Series from the list of dates and convert it with to_datetime.
- Create a PeriodIndex with Series.dt.to_period.
- groupby by index (level=0) and get the counts with GroupBy.size.
- Add the missing periods with Series.reindex, using a PeriodIndex built from the min and max values of the index.
- Finally plot, e.g. as bars with Series.plot.bar.
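If pandas is not available, the same steps (count per month, then fill in the empty months) can probably be done with the standard library alone. The following is only a minimal sketch, not part of the original answer: the helper name count_by_month and the month-ordinal trick (year * 12 + month - 1) are assumptions made for illustration. It takes datetime objects directly, as in the question.

```python
from collections import Counter
from datetime import datetime


def count_by_month(dates):
    """Return (labels, counts) for every month from the earliest to the latest date.

    Hypothetical helper for illustration; labels use the mm/yyyy form from the question.
    """
    if not dates:
        return [], []
    # Map each date to one integer per month so year boundaries need no special casing.
    counts = Counter(d.year * 12 + d.month - 1 for d in dates)
    first, last = min(counts), max(counts)
    labels, values = [], []
    for ordinal in range(first, last + 1):
        year, month0 = divmod(ordinal, 12)
        labels.append(f"{month0 + 1:02d}/{year}")
        values.append(counts[ordinal])  # a Counter returns 0 for months with no dates
    return labels, values


dates = [datetime(2018, 2, 28), datetime(2018, 3, 1),
         datetime(2018, 3, 16), datetime(2018, 5, 17)]
labels, values = count_by_month(dates)
print(labels)  # ['02/2018', '03/2018', '04/2018', '05/2018']
print(values)  # [1, 2, 0, 1]
```

Because the loop walks a single integer range, it does not depend on min_date.month being smaller than max_date.month, which was the problem with the nested year/month loops in the question. The two lists can then be passed to whatever plotting call is in use.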