SED - Non-greedy regex doesn't seem to work in sed
Problem description
When I run a regex pattern in an online RegEx testing tool against the text below, it works fine. However, it does not work when used in sed on Unix.
Text:
<Field1><Field2><Field3>001</Field3></Field2><Field4><FieldDesc>Transaction Successful</FieldDesc></Field4><DtTm><LocalDtTm>2016-07-01-12:05:40.383</LocalDtTm></DtTm><Field5><Field6>N</Field6><Field7></Field7><DtTm><LocalDtTm>2016-07-01-12:05:44.171</LocalDtTm></DtTm></Field5></Field1>
RegEx:
<DtTm>(.*?)<\/DtTm>
Usage in sed: looking to remove everything between <DtTm> and </DtTm> (tags included):
sed 's/<DtTm>(.*?)<\/DtTm>//g'
Expected output:
<Field1><Field2><Field3>001</Field3></Field2><Field4><FieldDesc>Transaction Successful</FieldDesc></Field4><Field5><Field6>N</Field6><Field7></Field7></Field5></Field1>
Recommended answer
GNU sed has two modes, basic and extended. Neither of these, nor the single basic mode of less advanced sed implementations, permits non-greedy quantifiers. As per the info sed output:
Note that the regular expression matcher is greedy, i.e., matches are attempted from left to right and, if two or more matches are possible starting at the same character, it selects the longest.
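To make the greedy behaviour concrete, here is a minimal sketch on a made-up input (the `<a>`/`<b>` string and the variable names are illustrative, not taken from the question). It also shows why the original command changes nothing: in sed's basic mode, `(`, `)` and `?` are ordinary characters, so the pattern searches for a literal `(...?)` that never occurs in the text.

```shell
# Small hypothetical input: two <a> elements separated by a <b> element.
input='<a>1</a><b>2</b><a>3</a>'

# Greedy: .* runs from the first <a> to the LAST </a>, swallowing <b>2</b> too.
greedy=$(printf '%s\n' "$input" | sed 's/<a>.*<\/a>//g')

# Basic mode treats ( ) and ? as literal characters, so this pattern looks for
# a literal "(...?)" and matches nothing; the line comes back unchanged.
literal=$(printf '%s\n' "$input" | sed 's/<a>(.*?)<\/a>//g')

printf 'greedy:  [%s]\n' "$greedy"
printf 'literal: [%s]\n' "$literal"
```

The first substitution leaves an empty line, because the single greedy match covers the whole input; the second prints the input untouched.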
So, if you need non-greedy, you will have to choose another tool, such as Perl (or something else supporting PCRE), which is probably what the online testing tool you mentioned is using.
The good thing is, the Perl substitute command is so stunningly similar to the sed one that you can often just change the program name (and possibly use a different delimiter character in complex REs so you don't end up with sawtooths like \/\/\/\/\/):
perl -pe 's|<DtTm>.*?</DtTm>||g'