Perl Regex Match and Removal


Question

I have a string that starts with //# and runs up to the newline character. The regex I have worked out for it is ..#([^\n]*).

My question is: how do you remove such a line from a file when it matches this pattern?

Recommended answer

Your regex is badly chosen on several points:

  1. Instead of matching two slashes specifically, you use .. to match two characters that can be anything at all, presumably because you don’t know how to match slashes when you’re also using them as delimiters. (Actually, dots match almost anything, as we’ll see in #3.)

     Within a slash-delimited regex literal, //, you can match slashes simply by protecting them with backslashes, e.g. /\/\//. The nicer variant, however, is to use the longer form of regex literal, m//, where you can choose the delimiter, e.g. m!!. Since the delimiter is then something other than a slash, you can write slashes without escaping them: m!//!. See perldoc perlop.

  2. It’s not anchored to the start of the string, so it will match anywhere. Put the ^ start-of-string assertion in front.

  3. You wrote [^\n] to match "any character except newline" when there is a much simpler way to write that, which is just the . wildcard. It does exactly that: by default it matches any character except newline.

  4. You are using parentheses to group a part of the match, but the group is neither quantified (you are not specifying that it can match any number of times other than exactly once) nor are you interested in keeping what it captures. So the parentheses are superfluous.
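The first three points above can be checked with a short script. This is only a minimal sketch: the sample strings and variable names are made up for illustration and do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical sample lines, invented for this illustration.
my $comment   = "//#pragma once\n";
my $embedded  = "int x; //# trailing note\n";
my $lookalike = "ab#not a comment\n";

# Point 1: both spellings match two literal slashes; m!...! avoids the escapes.
my $escaped = $comment =~ /\/\/#/ ? 1 : 0;
my $bang    = $comment =~ m!//#!  ? 1 : 0;

# Point 1 again: ".." accepts any two characters, so it also matches "ab#".
my $dots_on_lookalike = $lookalike =~ /..#/ ? 1 : 0;

# Point 2: without ^ the pattern matches mid-line; with ^ it does not.
my $unanchored = $embedded =~ m!//#!  ? 1 : 0;
my $anchored   = $embedded =~ m!^//#! ? 1 : 0;

# Point 3: "." and "[^\n]" consume the same text as long as the /s
# modifier is not in effect (with /s the dot would also match newline).
my ($with_class) = $comment =~ m!^//#([^\n]*)!;
my ($with_dot)   = $comment =~ m!^//#(.*)!;

printf "%-28s %s\n", 'escaped slashes match:',    $escaped;
printf "%-28s %s\n", 'm!! delimiters match:',     $bang;
printf "%-28s %s\n", '..# matches look-alike:',   $dots_on_lookalike;
printf "%-28s %s\n", 'unanchored matches inside:', $unanchored;
printf "%-28s %s\n", 'anchored matches inside:',  $anchored;
printf "%-28s %s\n", 'same capture:', ($with_class eq $with_dot ? 'yes' : 'no');
```

The captures are kept here only so the two character-matching forms can be compared; the pattern recommended in the answer does not need them.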

Altogether, that makes it m!^//#.*!. But putting an uncaptured .* (or anything with a * quantifier) at the end of a regex is meaningless, since it never changes whether a string will match or not: the * is happy to match nothing at all.

That leaves you with m!^//#!.
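To see that the trailing .* makes no difference, the two patterns can be run over the same input. The lines below are hypothetical sample data, and the sketch filters an in-memory list rather than a file.

```perl
use strict;
use warnings;

# Hypothetical input lines, invented for this illustration.
my @lines = (
    "//# generated header\n",
    "int main(void) {\n",
    "    return 0; //# not at line start\n",
    "//#\n",
    "}\n",
);

# Keep every line that does NOT start with //#.
my @kept_short = grep { !m!^//#! }   @lines;   # pattern from the answer
my @kept_long  = grep { !m!^//#.*! } @lines;   # same, with a trailing .*

# Both filters keep exactly the same lines.
print @kept_short;
print "identical results\n" if join('', @kept_short) eq join('', @kept_long);
```

Only the two lines that begin with //# are dropped; the line with //# further along is kept because of the ^ anchor.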

As for removing the line from the file, as everyone else explained, read it in line by line and print all the lines you want to keep back to another file. If you are not doing this within a larger program, use perl’s command line switches to do it easily:

perl -ni.bak -e'print unless m!^//#!' somefile.txt

Here, the -n switch makes perl put a loop around the code you provide, which reads all the files you pass on the command line in sequence. The -i switch (for "in-place") says to collect the output from your script and overwrite the original contents of each file with it. The .bak parameter to the -i option tells perl to keep a backup of the original file in a file named after the original file name with .bak appended. For all of these bits, see perldoc perlrun.
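For readers who want to see what those switches amount to, here is a rough script-level sketch of the same behaviour. It relies on the $^I variable (the in-place edit extension documented in perldoc perlvar); the helper name strip_marker_lines is hypothetical.

```perl
use strict;
use warnings;

# Remove lines starting with //# from each named file, leaving a .bak copy.
# strip_marker_lines is a made-up name for this illustration.
sub strip_marker_lines {
    my @files = @_;
    return unless @files;    # with no files, <> would fall back to STDIN

    # Setting $^I switches on in-place editing, like -i.bak on the command
    # line; the localised @ARGV tells <> which files to read, like -n does.
    local ($^I, @ARGV) = ('.bak', @files);

    while (my $line = <>) {
        # While in-place editing is active, a plain print goes to the
        # replacement file rather than to the terminal.
        print $line unless $line =~ m!^//#!;
    }
    return;
}

# Process whatever file names were given on the command line, if any.
strip_marker_lines(@ARGV) if @ARGV;
```

Because both variables are localised, in-place editing is switched off again as soon as the helper returns.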

If you want to do this within the context of a larger program, the easiest way to do it safely is to open the file twice: once for reading and, separately, once for writing with IO::AtomicFile. IO::AtomicFile will replace the original file only if it’s successfully closed.
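A minimal sketch of that approach might look like the following. It assumes IO::AtomicFile is installed from CPAN (it is not part of core Perl), and the helper name remove_marker_lines is hypothetical.

```perl
use strict;
use warnings;
use IO::AtomicFile;    # CPAN module; assumed to be installed

# remove_marker_lines is a made-up name for this illustration.
sub remove_marker_lines {
    my ($path) = @_;

    # First handle: the original file, opened for reading.
    open(my $in, '<', $path) or die "Cannot read $path: $!";

    # Second handle: output is written to a temporary file and only
    # takes the place of $path when close succeeds.
    my $out = IO::AtomicFile->open($path, 'w')
        or die "Cannot open $path for atomic writing: $!";

    while (my $line = <$in>) {
        next if $line =~ m!^//#!;    # skip the lines to be removed
        print {$out} $line;
    }

    close($in) or die "Cannot close $path: $!";

    # Closing the atomic handle is the step that replaces the original.
    $out->close or die "Cannot replace $path: $!";
    return;
}
```

A call such as remove_marker_lines('somefile.txt') would then rewrite that file in one step, with the file name here taken from the command-line example above.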
