Parse log file for lines containing 2 strings and the lines in between

Question

I am trying to parse some large log files to detect occurrences of a coding bug. The defect shows up as a sequence of strings on different lines, with a date in between. It is easier to show than to describe, so here is an example:

    <Result xmlns="">
    <Failure exceptionClass="processing" exceptionDetail="State_Open::Buffer Failed - none">
      <SystemID>ffds[sid=EPS_FFDS, 50] Version:01.00.00</SystemID>
      <Description>Lock Server failed </Description>
    </Failure>
  </Result>
</BufferReply>
7/22/2017 8:41:15 AM | SomeServer | Information | ResponseProcessing.TreatEPSResponse() is going to process a response or event. Response.ServiceID [Server_06] Response.Response [com.schema.fcc.ffds.BufferReply]

I will be searching for multiple instances of this sequence across multiple logs: Buffer Failed, followed by Server_#. The # in Server_# can be any 2-digit number, and Server_# will never be on the same line as Buffer Failed. Buffer Failed will never repeat before Server_# is found. The date and time sit in between; if possible, I would like to capture them as well.

Ideally, I would pipe something like this to another file:


Buffer Failed - none"   7/22/2017 8:41:15 AM [Server_06]

I have attempted a few things like:

Select-String 'Failed - none(.*?)Response.Response' -AllMatches

but it doesn't seem to work across lines.

Answer

Select-String can only match text spanning multiple lines if it receives the input as a single string. In addition, . normally matches any character except line feeds (\n). If you want it to match line feeds as well, you must prefix your regular expression with the modifier (?s). Otherwise you need an expression that does include line feeds, e.g. [\s\S] or (.|\n).
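As a minimal sketch of both points, the snippet below runs the same kind of pattern against line-by-line input, against a single joined string, and against the joined string without (?s). The two sample lines and the variable names are made up for illustration; with a real log, one way to get a single string is Get-Content -Raw.

```powershell
# Hypothetical two-line sample standing in for a real log file.
$lines = 'Buffer Failed - none"', '[Server_06] Response.Response'

# Line-by-line input: each line is tested on its own, so nothing matches.
$perLine = $lines | Select-String -Pattern '(?s)Failed - none.*?Response\.Response'

# Single-string input with (?s): . may cross the line feed, so this matches.
$joined = $lines -join "`n"
$whole = $joined | Select-String -Pattern '(?s)Failed - none.*?Response\.Response'

# Single-string input without (?s): . stops at the line feed, so no match.
$noDotAll = $joined | Select-String -Pattern 'Failed - none.*?Response\.Response'
```

Only $whole ends up holding a match; $perLine and $noDotAll stay empty.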

It might also be advisable to anchor the match at exceptionDetail rather than the actual detail text, because that makes the match more flexible.

Something like this should give you the result you're looking for:

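# (?s) lets . match line feeds, so the lazy .*? can span lines.
$re = '(?s)exceptionDetail="(.*?)".*?(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M).*?\[(.*?)\] Response\.Response'

# Out-String merges the piped lines into one string before matching.
... | Out-String |
    Select-String -Pattern $re -AllMatches |
    Select-Object -ExpandProperty Matches |
    ForEach-Object { '{0} {1} [{2}]' -f $_.Groups[1..3] }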

The expression uses non-greedy matches and 3 capturing groups to extract the exception detail, timestamp, and server name.
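To see the three groups in action, here is a self-contained sketch that runs the pattern against the sample from the question. The here-string and the $log and $results variable names are assumptions for illustration; in practice the text would come from the log files rather than being embedded in the script.

```powershell
# Sample log text from the question, held in a literal here-string (illustrative only).
$log = @'
    <Failure exceptionClass="processing" exceptionDetail="State_Open::Buffer Failed - none">
      <SystemID>ffds[sid=EPS_FFDS, 50] Version:01.00.00</SystemID>
      <Description>Lock Server failed </Description>
    </Failure>
7/22/2017 8:41:15 AM | SomeServer | Information | ResponseProcessing.TreatEPSResponse() is going to process a response or event. Response.ServiceID [Server_06] Response.Response [com.schema.fcc.ffds.BufferReply]
'@

$re = '(?s)exceptionDetail="(.*?)".*?(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M).*?\[(.*?)\] Response\.Response'

# Group 1 = exception detail, group 2 = timestamp, group 3 = server name.
$results = $log |
    Select-String -Pattern $re -AllMatches |
    Select-Object -ExpandProperty Matches |
    ForEach-Object {
        '{0} {1} [{2}]' -f $_.Groups[1].Value, $_.Groups[2].Value, $_.Groups[3].Value
    }

$results
# State_Open::Buffer Failed - none 7/22/2017 8:41:15 AM [Server_06]
```

Piping $results to Set-Content or Out-File would then write the extracted lines to another file, as the question asks.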
