Count number of occurrences of token in a file


Question

I have a server access log with a timestamp for each HTTP request, and I'd like to obtain a count of the number of requests at each second. Using sed and cut -c, I've so far managed to cut the file down to just the timestamps, such as:


22-Sep-2008 20:00:21 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:24 +0000
22-Sep-2008 20:00:24 +0000

What I'd love to get is the number of times each unique timestamp appears in the file. For the example above, I'd like to get output that looks like:


22-Sep-2008 20:00:21 +0000: 1
22-Sep-2008 20:00:22 +0000: 3
22-Sep-2008 20:00:24 +0000: 2

I've used sort -u to filter the list of timestamps down to a list of unique tokens, hoping that I could use grep like

grep -c -f <file containing patterns> <file>

but this just produces a single line with a grand total of matching lines.

I know this can be done in a single line by stringing a few utilities together, but I can't think of which ones. Does anyone know?

Recommended answer

I think you're looking for

uniq --count




-c, --count    prefix lines by the number of occurrences
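As a rough sketch of how this could fit the question: uniq only collapses adjacent duplicate lines, so the timestamps are usually sorted first, and uniq -c prints the count before the line rather than after it. The temporary file and the awk reformatting step below are assumptions added for illustration, not part of the original answer; the short -c form is used because --count is the GNU long option.

```shell
# Hypothetical input file holding the six sample timestamps from the question.
timestamps_file=$(mktemp)
cat > "$timestamps_file" <<'EOF'
22-Sep-2008 20:00:21 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:22 +0000
22-Sep-2008 20:00:24 +0000
22-Sep-2008 20:00:24 +0000
EOF

# uniq only merges adjacent duplicates, so sort the lines first.
# uniq -c emits "<count> <line>"; awk saves the count, strips it from
# the front of the line, and appends it as "<line>: <count>".
result=$(sort "$timestamps_file" | uniq -c |
  awk '{ count = $1; sub(/^[[:space:]]*[0-9]+[[:space:]]/, ""); print $0 ": " count }')

printf '%s\n' "$result"
rm -f "$timestamps_file"
```

If the count-first layout that uniq -c produces is acceptable, the awk stage can be dropped and sort piped straight into uniq -c.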
