SED - Non-greedy regex can't seem to work in sed

Question

When I run the regex pattern below in an online regex testing tool against the text below, it works fine. However, it does not work when I use it in sed on Unix.

Text:

<Field1><Field2><Field3>001</Field3></Field2><Field4><FieldDesc>Transaction Successful</FieldDesc></Field4><DtTm><LocalDtTm>2016-07-01-12:05:40.383</LocalDtTm></DtTm><Field5><Field6>N</Field6><Field7></Field7><DtTm><LocalDtTm>2016-07-01-12:05:44.171</LocalDtTm></DtTm></Field5></Field1>

RegEx:

<DtTm>(.*?)<\/DtTm>

Usage in sed: I am looking to remove everything between <DtTm> and </DtTm>, tags included:

sed 's/<DtTm>(.*?)<\/DtTm>//g'

Expected output:

<Field1><Field2><Field3>001</Field3></Field2><Field4><FieldDesc>Transaction Successful</FieldDesc></Field4><Field5><Field6>N</Field6><Field7></Field7></Field5></Field1>

Recommended answer

GNU sed has two modes, basic and extended. Neither of these, nor the single basic mode of less advanced sed implementations, permits non-greedy quantifiers. As the info sed output puts it:

Note that the regular expression matcher is greedy, i.e., matches are attempted from left to right and, if two or more matches are possible starting at the same character, it selects the longest.
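To make the greedy behaviour concrete, here is a minimal sketch. The `sample` string is a shortened, hypothetical stand-in for the text in the question, and the variable names are made up for illustration. It runs the question's original command and then a plain greedy `.*` version:

```shell
#!/bin/sh
# Hypothetical shortened sample containing two <DtTm> elements.
sample='<a><DtTm>1</DtTm><b>keep</b><DtTm>2</DtTm></a>'

# The command from the question. In sed's basic mode "(", ")" and "?"
# are ordinary literal characters, so the pattern looks for a literal
# "(" right after <DtTm>, finds none, and the line passes through unchanged.
unchanged=$(printf '%s\n' "$sample" | sed 's/<DtTm>(.*?)<\/DtTm>//g')

# A greedy .* matches from the first <DtTm> to the LAST </DtTm>,
# so the <b>keep</b> element in between is deleted as well.
greedy=$(printf '%s\n' "$sample" | sed 's|<DtTm>.*</DtTm>||g')

printf '%s\n' "$unchanged"   # same as the input
printf '%s\n' "$greedy"      # <a></a>
```

Neither result is what the question asks for: the first command changes nothing, and the second removes too much.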

So, if you need non-greedy matching, you will have to choose another tool, such as Perl (or something else supporting PCRE), which is probably what the online testing tool you mentioned is using.

The good thing is that the Perl substitute command is so similar to the sed one that you can often just change the program name (and possibly use a different delimiter character in complex REs so you don't end up with sawtooths like \/\/\/\/\/):

perl -pe 's|<DtTm>.*?</DtTm>||g'
