Condition inside regex pattern

Question

I am parsing a docblock and would like to remove any extra whitespace from it. The problem is that I do not want to remove whitespace within a <code>code goes here</code> block.

For example, I use this to remove extra whitespace:

$string = preg_replace('/[ ]{2,}/', '', $string);

But I want to keep the whitespace inside <code></code>.

This code/string:

This  is some  text
  This is also   some text

<code>
User::setup(array(
    'key1' => 'value1',
    'key2' => 'value1'
));
</code>

Should be transformed into:

This is some text
This is also some text

<code>
User::setup(array(
    'key1' => 'value1',
    'key2' => 'value1'
));
</code>

How can I do this?

Recommended answer

You aren't really looking for a condition; you need a way to skip parts of the string so they are not replaced. This can be done fairly easily with preg_replace by adding dummy capture groups and replacing each group with itself. In your case you only need one:

$str = preg_replace("~(<code>.*?</code>)|^ +| +$|( ) +~smi" , "$1$2", $str);

How does it work?

  • (<code>.*?</code>) - matches a <code> block into the first group, $1. This assumes simple formatting and no nesting; the pattern can be made more elaborate if needed.
  • ^ + - matches and removes spaces at the beginnings of lines.
  • [ ]+$ - matches and removes spaces at the ends of lines (written as " +$" in the pattern, which is equivalent).
  • ( ) + - matches two or more spaces in the middle of a line and captures the first one into the second group, $2.

The replacement string, $1$2, keeps <code> blocks and the first space when they are captured, and removes everything else the pattern matches.
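To make the walkthrough concrete, here is a minimal sketch that runs the pattern from the answer on the sample text from the question. The variable names ($pattern, $result) are only illustrative, and the comments on the s, m and i modifiers describe standard PCRE behaviour rather than anything stated in the answer itself.

```php
<?php
// Sample input from the question: extra spaces in the prose,
// indentation that must survive inside the <code> block.
$str = "This  is some  text\n"
     . "  This is also   some text\n"
     . "\n"
     . "<code>\n"
     . "User::setup(array(\n"
     . "    'key1' => 'value1',\n"
     . "    'key2' => 'value1'\n"
     . "));\n"
     . "</code>\n";

// s: "." also matches newlines, so <code>.*?</code> can span several lines.
// m: "^" and "$" match at every line boundary, not only at the ends of the string.
// i: case-insensitive matching, so <CODE> is treated like <code>.
$pattern = '~(<code>.*?</code>)|^ +| +$|( ) +~smi';

// $1 puts a matched <code> block back unchanged, $2 puts back a single space;
// whatever else the pattern matched is dropped.
$result = preg_replace($pattern, '$1$2', $str);

echo $result;
```

The prose lines come out with single spaces and no leading indentation, while the four-space indentation inside the <code> block is left as it was, because the whole block is consumed by the first alternative and written back through $1.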

Things to remember:

  • If $1 or $2 did not capture anything, it is replaced with an empty string.
  • Alternation (a|b|c) is tried from left to right: once one alternative matches at a given position, the engine is satisfied and does not try the others. That is why ^ +| +$ must come before ( ) +.
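The effect of the alternation order can be checked with a small experiment. This is a sketch only: it drops the <code> group for brevity (so the space is captured in $1 rather than $2), and the reordered pattern is a hypothetical variant that does not appear in the answer.

```php
<?php
$line = "  leading and   inner spaces  ";

// Order used in the answer: the anchored alternatives are tried first,
// so runs of spaces at the start or end of the line are removed entirely.
$good = preg_replace('~^ +| +$|( ) +~m', '$1', $line);

// Hypothetical reordering: "( ) +" is tried first, so it also wins at the
// start and end of the line and leaves one space behind in each place.
$bad = preg_replace('~( ) +|^ +| +$~m', '$1', $line);

// $good is "leading and inner spaces"
// $bad  is " leading and inner spaces "
echo '[' . $good . "]\n";
echo '[' . $bad . "]\n";
```

In the first call the capture group takes no part in the leading and trailing matches, so $1 expands to an empty string there, which is the behaviour described in the first bullet.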

Working example: http://ideone.com/HxbaV