Perl 正则表达式压缩多个换行符 [英] Perl Regex To Condense Multiple Line Breaks

查看:59
本文介绍了Perl 正则表达式压缩多个换行符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我似乎无法找出正确的语法,但我想要一个 Perl 正则表达式来查找一行中有两个或多个换行符的位置,并将它们压缩为 2 个换行符.

I can't seem to figure out the right syntax but I want a Perl regular expression to find where there are two or more line breaks in a row and condense them into just 2 line breaks.

这是我今天使用的,但似乎不起作用:

Here is what I'm using today which doesn't seem to work:

$string =~ s/\n\n+/\n\n/g;

请告诉我我做错了什么以及我应该使用正确的 Perl 正则表达式.

Please let me know what I'm doing wrong and the correct Perl regex I should be using.

预先感谢您的帮助!

推荐答案

如果您使用的是 Perl 5.10 或更高版本,请尝试以下操作:

If you're using Perl 5.10 or later, try this:

$string =~ s/(\R)(?:\h*\R)+/$1$1/g;

\R 是通用的行分隔符转义序列 (ref),并且 \h 匹配任何水平空白字符(例如空格和制表符)(参考).因此,这会将一个或多个 空白 行的任何序列转换为一个 行.

\R is the generic line-separator escape sequence (ref), and \h matches any horizontal whitespace character (e.g. space and TAB) (ref). So this will convert any sequence of one or more blank lines to one empty line.

如今的大多数应用程序在他们将识别为行分隔符方面都是自由的;他们甚至会接受在同一个文档中混合使用两种或多种分隔符样式.另一方面,一些应用程序会主动将所有行分隔符转换为一种首选样式.但有时你确实必须坚持一种特定的风格;这就是为什么我捕获第一个 \R 匹配并将其用作替换,而不是随意使用 \n.

Most applications these days are liberal in what they'll recognize as a line separator; they'll even accept a mix of two or more styles of separator in the same document. On the other hand, some apps actively convert all line separators to one preferred style. But sometimes you do have to stick to one particular style; that's why I captured the first \R match and used it as the replacement, instead of arbitrarily using \n.

请注意,这些特殊的转义序列在其他正则表达式中并未得到广泛支持.它们适用于最新版本的 PHP,而 \R 似乎适用于 Ruby 2.0,但我找不到任何提及它的文档.Ruby 1.9.2 和 2.0 支持 \h 转义序列,但它匹配十六进制数字 ([0-9a-fA-F]),而不是水平空格.在大多数其他风格中,\R\h 将抛出异常或匹配文字 Rh分别.

Be aware that these special escape sequences aren't widely supported in other regex flavors. They work in recent versions of PHP, and \R seems to work in Ruby 2.0, though I can't find any doc that mentions it. Ruby 1.9.2 and 2.0 support a \h escape sequence, but it matches a hexadecimal digit ([0-9a-fA-F]), not horizontal whitespace. In most other flavors, \R and \h will either throw an exception or match a literal R and h respectively.

这篇关于Perl 正则表达式压缩多个换行符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆