Parse log file for lines containing 2 strings and the lines in between
Question
I am trying to parse some large log files to detect occurrences of a coding bug. Identifying the defect is finding a sequence of strings on different lines with a date in between. I am terrible at describing things so posting an example:
<Result xmlns="">
<Failure exceptionClass="processing" exceptionDetail="State_Open::Buffer Failed - none">
<SystemID>ffds[sid=EPS_FFDS, 50] Version:01.00.00</SystemID>
<Description>Lock Server failed </Description>
</Failure>
</Result>
</BufferReply>
7/22/2017 8:41:15 AM | SomeServer | Information | ResponseProcessing.TreatEPSResponse() is going to process a response or event. Response.ServiceID [Server_06] Response.Response [com.schema.fcc.ffds.BufferReply]
I will be searching for multiple instances of this sequence through multiple logs: Buffer Failed followed by Server_#. The Server_# can be any 2-digit number and will never be on the same line as Buffer Failed. Buffer Failed will never repeat prior to Server_# being found. The date and time sit in between the two; if possible, they should be captured as well.
Ideally, I would pipe something like this to another file
Buffer Failed - none" 7/22/2017 8:41:15 AM [Server_06]
I have attempted a few things like
Select-String 'Failed - none(.*?)Response.Response' -AllMatches
but it doesn't seem to work across lines.
Recommended answer
Select-String can only match text spanning multiple lines if it receives the input as a single string. Plus, the dot (.) normally matches any character except line feeds (\n). If you want it to match line feeds as well, you must prefix your regular expression with the modifier (?s). Otherwise you need an expression that does include line feeds, e.g. [\s\S] or (.|\n).
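Both points can be seen in a minimal sketch. It uses a made-up two-line sample string rather than the real log, and the variable names are illustrative only:

```powershell
# Illustrative sample: a here-string is ONE string that contains a line break.
$text = @'
Failed - none
Response.Response
'@

# Without (?s), '.' stops at the line feed, so nothing matches across the two lines.
$plain = $text | Select-String -Pattern 'Failed - none(.*?)Response\.Response' -AllMatches

# With (?s), '.' also matches line feeds, so the match spans both lines.
$dotAll = $text | Select-String -Pattern '(?s)Failed - none(.*?)Response\.Response' -AllMatches

# [\s\S] matches any character, line feeds included, without a modifier.
$charClass = $text | Select-String -Pattern 'Failed - none([\s\S]*?)Response\.Response' -AllMatches

[bool]$plain      # False
[bool]$dotAll     # True
[bool]$charClass  # True
```

When the text comes from a file, Get-Content with the -Raw switch (PowerShell 3.0 and later) is another way to get the content as a single string.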
It might also be advisable to anchor the match at exceptionDetail rather than the actual detail, because that makes the match more flexible.
Something like this should give you the result you're looking for:
$re = '(?s)exceptionDetail="(.*?)".*?(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M).*?\[(.*?)\] Response\.Response'
... | Out-String |
Select-String -Pattern $re -AllMatches |
Select -Expand Matches |
ForEach-Object { '{0} {1} [{2}]' -f $_.Groups[1..3] }
The expression uses non-greedy matches and 3 capturing groups to extract the exception detail, the timestamp and the server name.
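As a rough check, the expression can be run against a shortened excerpt of the log from the question. The $log and $lines variable names are assumptions for illustration, and the capture groups are read through their Value property to make the formatting explicit:

```powershell
# Shortened excerpt of the question's log, held in a single string.
$log = @'
<Failure exceptionClass="processing" exceptionDetail="State_Open::Buffer Failed - none">
<SystemID>ffds[sid=EPS_FFDS, 50] Version:01.00.00</SystemID>
</Failure>
7/22/2017 8:41:15 AM | SomeServer | Information | ResponseProcessing.TreatEPSResponse() is going to process a response or event. Response.ServiceID [Server_06] Response.Response [com.schema.fcc.ffds.BufferReply]
'@

$re = '(?s)exceptionDetail="(.*?)".*?(\d+/\d+/\d+ \d+:\d+:\d+ [AP]M).*?\[(.*?)\] Response\.Response'

$lines = $log |
    Select-String -Pattern $re -AllMatches |
    Select-Object -ExpandProperty Matches |
    ForEach-Object {
        # Group 1: exception detail, group 2: timestamp, group 3: server name.
        '{0} {1} [{2}]' -f $_.Groups[1].Value, $_.Groups[2].Value, $_.Groups[3].Value
    }

$lines
# State_Open::Buffer Failed - none 7/22/2017 8:41:15 AM [Server_06]
```

To write the results to another file, the pipeline could end in Set-Content with a path of your choosing (for example a hypothetical results.txt) instead of being assigned to a variable.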