Linux shell: Detecting state changes on a CSV log file
Problem description
This is my automatically generated log file (my internet line is checked once a minute):
$ cat log.csv
2018-04-27,23:37,OK
2018-04-27,23:38,OK
2018-04-27,23:39,OK
2018-04-27,23:40,ERROR
2018-04-27,23:41,ERROR
2018-04-27,23:42,OK
2018-04-27,23:43,OK
2018-04-27,23:44,OK
2018-04-27,23:45,OK
I am trying to make it more comfortable to read by scripting some interpretation method that depends on the third field (line state: OK/ERROR) and removes consecutive repeated values: some sort of shell script that shows only the initial state and each state change of the internet line, transforming the above log into:
2018-04-27,23:37,OK
2018-04-27,23:40,ERROR
2018-04-27,23:42,OK
That would mean:
2018-04-27,23:37,Entered Status OK
2018-04-27,23:40,Entered Status ERROR
2018-04-27,23:42,Entered Status OK
As can be seen, keeping only the state changes makes the log shorter and easier to read.
Assuming the log file could be very long (consider months of one-minute logging), what would be a proper (efficient) method to script this "interpretation" on a Linux shell?
I know that iterating in the shell is not considered a very good idea, so I was thinking about AWK, but I am not very experienced with it.
Loop-based solutions would be better than nothing, of course.
Other data:
- A similar (but not identical) question (also asked by me)
Recommended answer
AWK approach:
awk -F "," '$3==last{next} {last=$3} {print $0}' log.csv
Output:
2018-04-27,23:37,OK
2018-04-27,23:40,ERROR
2018-04-27,23:42,OK
How it works:
- -F "," sets the comma as the field separator.
- $3==last{next} skips any line whose third field equals the variable last: the command next tells awk to skip all remaining commands and start over on the next line.
- last=$3 saves the third field (OK/ERROR) in the variable last.
- {print $0} prints the current line.
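The question also asks for output lines of the form "Entered Status OK". The same skip-if-unchanged logic can produce that directly. The following is only a minimal sketch, not part of the original answer: the function name state_changes is hypothetical, and it assumes every line has exactly the three fields date,time,state.

```shell
# Hypothetical helper: print one line per state change, labelled "Entered Status".
# Reads CSV lines of the form date,time,state from the given files or from stdin.
state_changes() {
    awk -F',' -v OFS=',' '
        $3 == last { next }                     # same state as the previous line: skip it
        {
            last = $3                           # remember the new state
            print $1, $2, "Entered Status " $3  # date,time,Entered Status <state>
        }
    ' "$@"
}

# Sample data taken from the question's log.
printf '%s\n' \
    '2018-04-27,23:37,OK' \
    '2018-04-27,23:38,OK' \
    '2018-04-27,23:40,ERROR' \
    '2018-04-27,23:41,ERROR' \
    '2018-04-27,23:42,OK' | state_changes
```

On a real log it would be called as state_changes log.csv. Since awk reads the file once, line by line, and keeps only the last state in memory, this should scale to months of one-minute entries.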
This is not a perfect solution for every special case, e.g. if the third field is empty (an empty third field on the very first line would be skipped, because last starts out empty). But it is enough for me.
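If the empty-field case matters, one possible variation is to always print the first line and compare only from the second line on. This is a sketch under that assumption, not the original answer's method, and the function name state_changes_strict is hypothetical.

```shell
# Hypothetical variant: always print the first record, then print a record
# only when its third field differs from the previous record's third field.
state_changes_strict() {
    awk -F',' '
        NR == 1 || $3 != last { print }   # first line, or state changed
        { last = $3 }                     # remember the state of every line
    ' "$@"
}

# Example with an empty state on the first two lines.
printf '%s\n' \
    '2018-04-27,23:37,' \
    '2018-04-27,23:38,' \
    '2018-04-27,23:39,OK' | state_changes_strict
```

Here the first line is kept even though its state is empty, the second is dropped as a repeat, and the third is kept as a change.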
Thanks to John1024 on this thread.