Pandas: Counting frequency of datetime objects in a column

Published: 2020/5/24 2:19:41. Tags: python, python-2.7, pandas

Problem description

I have a column (from my original data) that I have converted from a string to a datetime-object in Pandas.

The column looks like this:

0     2012-01-15 11:10:12
1     2012-01-15 11:15:01
2     2012-01-16 11:15:12
3     2012-01-16 11:25:01
...
4     2012-01-22 11:25:11
5     2012-01-22 11:40:01
6     2012-01-22 11:40:18
7     2012-01-23 11:40:23
8     2012-01-23 11:40:23
...
9     2012-01-30 11:50:02
10    2012-01-30 11:50:41
11    2012-01-30 12:00:01
12    2012-01-30 12:00:34
13    2012-01-30 12:45:01
...
14    2012-02-05 12:45:13
15    2012-02-05 12:55:01
15    2012-02-05 12:55:01
16    2012-02-05 12:56:11
17    2012-02-05 13:10:01
...
18    2012-02-11 13:10:11
...
19    2012-02-20 13:25:02
20    2012-02-20 13:26:14
21    2012-02-20 13:30:01
...
22    2012-02-25 13:30:08
23    2012-02-25 13:30:08
24    2012-02-25 13:30:08
25    2012-02-26 13:30:08
26    2012-02-27 13:30:08
27    2012-02-27 13:30:08
28    2012-02-27 13:30:25
29    2012-02-27 13:30:25

What I would like to do is to count the frequency of each date occurring. As you can see, I have left some dates out, but if I were to compute the frequency manually (for visible values), I would have:

2012-01-15 - 2 (frequency)
2012-01-16 - 2
2012-01-22 - 3
2012-01-23 - 2
2012-01-30 - 5
2012-02-05 - 5
2012-02-11 - 1
2012-02-20 - 3
2012-02-25 - 3
2012-02-26 - 1
2012-02-27 - 4

This is the daily frequency and I would like to count it. I have so far tried this:

df[df.str.contains(r'^\d\d\d\d-\d\d-\d\d')].value_counts()

I know this fails because the values are not string objects, but I am not sure how else to count them.

I have also looked at the .dt property, but the Pandas documentation is very verbose on these simple frequency calculations.

Also, to generalize this, how would I:

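Extend the daily frequency to a weekly frequency (e.g. Monday to Sunday)
Extend the daily frequency to a monthly frequency (e.g. how many times "2012-01-**" appears in my column)
Apply the daily/weekly/monthly count together with another column (e.g. if a column contains "GET requests", I want to know how many occur per day, then per week, then per month)
Apply the weekly count together with another condition (e.g. a column returns "404 Not found" and I want to check how many "404 Not found" I receive per week)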

Perhaps the solution is a long one, where I may need to do lots of: split-apply-combine ... but I was made to believe that Pandas simplifies/abstracts away a lot of the work, which is why I am stuck now.

The source of this file could be considered something equivalent to a server-log file.

Recommended answer

You can first get the date part of the datetime, and then use value_counts:

s.dt.date.value_counts()
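As a hedged illustration, the sketch below applies the same call to a handful of the timestamps shown in the question. The variable names and the `sort_index()` call are assumptions added here: `value_counts()` orders its result by count, so sorting the index puts the days back in chronological order. `dt.normalize()` is shown as an alternative that keeps `datetime64` values instead of producing Python `date` objects.

```python
import datetime
import pandas as pd

# A few timestamps taken from the question's column
s = pd.Series(pd.to_datetime([
    "2012-01-15 11:10:12",
    "2012-01-15 11:15:01",
    "2012-01-16 11:15:12",
    "2012-01-16 11:25:01",
    "2012-01-22 11:25:11",
    "2012-01-22 11:40:01",
    "2012-01-22 11:40:18",
]))

# Count rows per calendar day, then order the result by date
daily = s.dt.date.value_counts().sort_index()

# Alternative: truncate each timestamp to midnight, keeping datetime64 values
daily_ts = s.dt.normalize().value_counts().sort_index()

print(daily)
print(daily_ts)
```

The first form gives an index of `datetime.date` objects (object dtype); the second keeps timestamps in the index, which can be more convenient for later resampling or plotting.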

A small example:

In [12]: s = pd.Series(pd.date_range('2012-01-01', freq='11H', periods=6))

In [13]: s
Out[13]:
0   2012-01-01 00:00:00
1   2012-01-01 11:00:00
2   2012-01-01 22:00:00
3   2012-01-02 09:00:00
4   2012-01-02 20:00:00
5   2012-01-03 07:00:00
dtype: datetime64[ns]

In [14]: s.dt.date
Out[14]:
0    2012-01-01
1    2012-01-01
2    2012-01-01
3    2012-01-02
4    2012-01-02
5    2012-01-03
dtype: object

In [15]: s.dt.date.value_counts()
Out[15]:
2012-01-01    3
2012-01-02    2
2012-01-03    1
dtype: int64
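For the weekly and monthly variants asked about in the question, one possible approach (not part of the original answer) is to convert the timestamps to periods with `dt.to_period` and count those. Combining this with a boolean filter on another column would cover the "404 Not found" case. The DataFrame below and its column names `ts` and `status` are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical log data: the column names "ts" and "status" are assumptions
df = pd.DataFrame({
    "ts": pd.to_datetime([
        "2012-01-15 11:10:12",
        "2012-01-16 11:15:12",
        "2012-01-22 11:25:11",
        "2012-01-23 11:40:23",
        "2012-02-05 12:45:13",
    ]),
    "status": ["200 OK", "404 Not found", "404 Not found",
               "404 Not found", "200 OK"],
})

# Rows per calendar month (periods such as 2012-01, 2012-02)
monthly = df["ts"].dt.to_period("M").value_counts().sort_index()

# Rows per week; "W-SUN" weeks end on Sunday, so they run Monday to Sunday
weekly = df["ts"].dt.to_period("W-SUN").value_counts().sort_index()

# Weekly count restricted to rows that match a condition in another column
not_found = df[df["status"] == "404 Not found"]
weekly_404 = not_found["ts"].dt.to_period("W-SUN").value_counts().sort_index()

print(monthly)
print(weekly)
print(weekly_404)
```

Periods with no rows do not appear in the result; if empty weeks or months should show up as zero, the counts would need to be reindexed over the full period range.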
