Problem with Perl multiline matching

Question

I'm trying to use a perl one-liner to update some code that spans multiple lines and am seeing some strange behavior. Here's a simple text file that shows the problem I'm seeing:

ABCD    START
         STOP    EFGH

I expected the following to work but it doesn't end up replacing anything:

perl -pi -e 's/START\s+STOP/REPLACE/s' input.txt

After doing some experimenting I found that the \s+ in the original regex will match the newline but not any of the whitespace on the 2nd line, and adding a second \s+ doesn't work either. So for now I'm doing the following workaround, which is to add an intermediate regex that only removes the newline:

perl -pi -e 's/START\s+/START/s' input.txt

This creates the following intermediate file:

ABCD    START            STOP    EFGH

Then I can run the original regex (although the /s is no longer needed):

perl -pi -e 's/START\s+STOP/REPLACE/s' input.txt

This creates the final, desired file:

ABCD    REPLACE    EFGH

It seems like the intermediate step should not be necessary. Am I missing something?

Recommended answer

perl -p processes the file one line at a time. The regex you have is correct, but it is never matched against the multi-line string.

A simple strategy, assuming the file will fit in memory, is to read the whole thing (do this without -p):

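use strict;
use warnings;

local $/ = undef;    # no input record separator: <> returns the whole input at once
my $file = <>;
$file =~ s/START\s+STOP/REPLACE/sg;    # /g replaces every occurrence
print $file;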

Note, I have added the /g modifier to specify global replacement.

As a shortcut for all that extra boilerplate, you can use your existing script with the -0777 option, which makes Perl read each file as a single record: perl -0777 -pi -e 's/START\s+STOP/REPLACE/sg' input.txt. Adding /g is still needed if you may need to make multiple replacements within the file.

A hiccup that you might run into, although not with this regex: if the regex were START.+STOP, and a file contains multiple START/STOP pairs, greedy matching of .+ will eat everything from the first START to the last STOP. You can use non-greedy matching (match as little as possible) with .+?.

If you want to use the ^ and $ anchors for line boundaries anywhere in the string, then you also need the /m regex modifier.
