Counting observations after grouping by dates in pandas
Problem description
What is the best way to count observations by date in a Pandas DataFrame when the timestamps are non-unique?
import datetime
import numpy as np
import pandas as pd
df = pd.DataFrame({'User' : ['A', 'B', 'C'] * 40,
                   'Value' : np.random.randn(120),
                   'Time' : [np.random.choice(pd.date_range(datetime.datetime(2013,1,1,0,0,0),datetime.datetime(2013,1,3,0,0,0),freq='H')) for i in range(120)]})
Ideally, the output would provide the number of observations per day (or some other higher order unit of time). This could then be used to plot the activity over time.
2013-01-01 60
2013-01-02 60
Recommended answer
The "un-Panda-ic" way of doing this would be using a Counter object on the series of datetimes converted to dates, converting this counter back to a series, and coercing the index on this series to datetimes.
In[1]: from collections import Counter
In[2]: counted_dates = Counter(df['Time'].apply(lambda x: x.date()))
In[3]: counted_series = pd.Series(counted_dates)
In[4]: counted_series.index = pd.to_datetime(counted_series.index)
In[5]: counted_series
Out[5]:
2013-01-01 60
2013-01-02 60
A more "Panda-ic" way would be to use a groupby operation on the series and then aggregate the output by length.
In[1]: grouped_dates = df.groupby(df['Time'].apply(lambda x : x.date()))
In[2]: grouped_dates['Time'].aggregate(len)
Out[2]:
2013-01-01 60
2013-01-02 60
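The `len` aggregation above can also be expressed with the built-in `size()` method of a groupby object. The following is a minimal sketch on a small hand-made frame; the `events` name and its values are illustrative assumptions, not data from the question.

```python
import pandas as pd

# Hypothetical sample data: three observations on Jan 1
# (two of them share a timestamp) and one on Jan 2.
events = pd.DataFrame({
    "User": ["A", "B", "C", "A"],
    "Time": pd.to_datetime([
        "2013-01-01 05:00", "2013-01-01 05:00",
        "2013-01-01 17:00", "2013-01-02 09:00",
    ]),
})

# dt.normalize() truncates each timestamp to midnight, so the
# group keys stay datetime64 values rather than Python date objects.
daily_counts = events.groupby(events["Time"].dt.normalize()).size()
print(daily_counts)
```

Because the keys remain datetimes, the resulting index should not need the `pd.to_datetime` coercion used in the Counter approach.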
Edit: another highly concise possibility is to use the `nunique` method, which counts the distinct timestamps in each group:
In[1]: df.groupby(df['Time'].apply(lambda x : x.date())).agg({'Time':pd.Series.nunique})
Out[1]:
2013-01-01 60
2013-01-02 60
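Since `nunique` counts distinct values rather than rows, its result can fall below the number of observations when timestamps repeat, which is the situation the question describes. A minimal sketch of the difference, using hypothetical timestamps chosen for illustration:

```python
import pandas as pd

# Hypothetical timestamps: two of the three observations on Jan 1 are identical.
times = pd.Series(pd.to_datetime([
    "2013-01-01 05:00", "2013-01-01 05:00", "2013-01-01 17:00",
]))

by_day = times.groupby(times.dt.date)
n_observations = by_day.size()   # counts every row in the group
n_distinct = by_day.nunique()    # counts distinct timestamps only
print(n_observations.iloc[0], n_distinct.iloc[0])
```

For a per-day observation count with non-unique timestamps, `size()` (or `len`) is therefore the safer aggregation.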
Besides stylistic differences, does one have significant performance advantages over the other? Are there other methods built-in that I've overlooked?
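One other built-in worth considering is `resample`, which groups a datetime-indexed frame into fixed time bins. The sketch below assumes a small hypothetical frame named `events`; it makes no claim about relative performance, which would need to be measured on the real data.

```python
import pandas as pd

# Hypothetical sample data: two observations on Jan 1, none on Jan 2, one on Jan 3.
events = pd.DataFrame({
    "User": ["A", "B", "C"],
    "Time": pd.to_datetime([
        "2013-01-01 05:00", "2013-01-01 17:00", "2013-01-03 09:00",
    ]),
})

# resample('D') bins rows by calendar day; size() counts the rows in each bin.
# Days with no observations appear with a count of 0.
per_day = events.set_index("Time").resample("D").size()
print(per_day)
```

Changing `'D'` to another frequency alias, such as `'W'` for weeks, gives counts for a coarser unit of time.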