Count number of rows for a timestamp


Problem description

I am working on this dataset:

https://pastebin.com/PEFUspiU

I have to group it and count how many requests there are in a particular period of time; then it will be easy to draw a chart of time vs. the number of requests.

For example:

**timestamp - number of request**

21-06-2016 09:00:00 - 2

21-06-2016 10:00:00 - 1

21-06-2016 11:00:00 - 5

How can I get this count?

Thanks.

P.S. I tried to use data['timestamp'].value_counts() but got errors:

from datetime import datetime

import pandas as pd
import numpy as np
import matplotlib.pylab as plt
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 15, 6

# Parse timestamps such as "21-06-2016 09:00:00" (pd.datetime no longer exists in recent pandas)
dateparse = lambda dates: datetime.strptime(dates, '%d-%m-%Y %H:%M:%S')
data = pd.read_csv('/home/amfirnas/Desktop/localhost_access_log.2016-06-21.csv',
                   parse_dates=['timestamp'], index_col='timestamp', date_parser=dateparse)

print(data.head(25))

# print(data['time'].value_counts())

# print(data.groupby(['time']).groups.keys())

# This line fails with a KeyError: index_col='timestamp' moved that column into the index
ts = data['timestamp'].value_counts()

# plt.plot(ts)
# plt.show()

Recommended answer

If you want to count the requests for each hour, group the rows by hour and count each group instead of calling value_counts(). For that to work, make sure your timestamp column is a pandas datetime:

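# Parse day-first strings such as "21-06-2016 09:00:00" into pandas datetimes
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%d-%m-%Y %H:%M:%S')
# Group the rows into 1-hour bins and count the non-null values in each column
df.groupby(pd.Grouper(key='timestamp', freq="1H")).count()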
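
To make the grouping step concrete, here is a minimal self-contained sketch. It assumes the timestamps are stored in a regular column named timestamp (not in the index) and use the day-first format from the question. The sample rows, the path column, and the helper name hourly_request_counts are hypothetical, made up only for illustration. It uses size() rather than count() so that the result is a single Series of row counts per hour, and it spells the frequency as "h"; older pandas releases conventionally write it as "H".

```python
import io

import pandas as pd

# Hypothetical sample rows; only the 'timestamp' column name comes from the question.
SAMPLE_CSV = """timestamp,path
21-06-2016 09:05:10,/a
21-06-2016 09:40:00,/b
21-06-2016 10:15:30,/a
21-06-2016 11:00:00,/c
21-06-2016 11:10:00,/a
21-06-2016 11:20:00,/b
21-06-2016 11:30:00,/c
21-06-2016 11:59:59,/a
"""


def hourly_request_counts(df: pd.DataFrame) -> pd.Series:
    """Return the number of rows in each 1-hour bin of df['timestamp']."""
    df = df.copy()  # leave the caller's frame untouched
    df["timestamp"] = pd.to_datetime(df["timestamp"], format="%d-%m-%Y %H:%M:%S")
    # size() counts rows per bin, regardless of missing values in other columns
    return df.groupby(pd.Grouper(key="timestamp", freq="h")).size()


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
counts = hourly_request_counts(df)
print(counts)
```

With this sample the hours starting at 09:00, 10:00 and 11:00 get 2, 1 and 5 rows, matching the example in the question. If matplotlib is installed, calling counts.plot() on the resulting Series should give the time vs. number of requests chart.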
