PHP Regex expression excluding <pre> tag

Question

I am using a WordPress plugin named Acronyms (https://wordpress.org/plugins/acronyms/). This plugin replaces acronyms with their descriptions. It uses PHP's preg_replace function.

The issue is that it also replaces acronyms contained in a <pre> tag, which I use to present source code.

Could you modify this expression so that it won't replace acronyms contained inside <pre> tags (not only directly inside, but at any nesting depth)? Is it possible?

The PHP code is:

$text = preg_replace(
    "|(?!<[^<>]*?)(?<![?.&])\b$acronym\b(?!:)(?![^<>]*?>)|msU"
  , "<acronym title=\"$fulltext\">$acronym</acronym>"
  , $text
);

Solution

You can use a PCRE SKIP/FAIL regex trick (also works in PHP) to tell the regex engine to only match something if it is not inside some delimiters:

(?s)<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b$acronym\b

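This means: skip every substring that starts with a <pre> opening tag and ends with </pre>, and match $acronym as a whole word only outside those substrings. Once the first alternative has consumed a <pre> block, (*F) forces it to fail and (*SKIP) stops the engine from retrying inside the text it just consumed, so the search resumes after </pre>.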

See demo on regex101.com

Here is a sample PHP demo:

<?php
$acronym = "ASCII";
$fulltext = "American Standard Code for Information Interchange";
// First alternative consumes a whole <pre>...</pre> block and discards it;
// second alternative matches the acronym as a whole word everywhere else.
$re = "/(?s)<pre[^<]*>.*?<\\/pre>(*SKIP)(*F)|\\b$acronym\\b/";
$str = "<pre>ASCII\nSometext\nMoretext</pre>More text \nASCII\nMore text<pre>More\nlines\nASCII\nlines</pre>";
$subst = "<acronym title=\"$fulltext\">$acronym</acronym>";
$result = preg_replace($re, $subst, $str);
echo $result;

Output:

<pre>ASCII
Sometext
Moretext</pre>More text 
<acronym title="American Standard Code for Information Interchange">ASCII</acronym>
More text<pre>More
lines
ASCII
lines</pre>
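
If the acronym or its description can contain regex or HTML special characters, the same pattern can be wrapped in a small helper. The following is only a sketch under assumptions: the function name wrap_acronym_outside_pre is hypothetical and not part of the Acronyms plugin, and it assumes PHP 7.4 or later. It escapes the acronym with preg_quote, escapes the title with htmlspecialchars, and uses preg_replace_callback so that characters such as "$" or "\" in the description are not read as backreferences.

```php
<?php
// Hypothetical helper (not part of the Acronyms plugin): wraps whole-word
// occurrences of $acronym in <acronym> tags, leaving <pre>...</pre> untouched.
function wrap_acronym_outside_pre(string $text, string $acronym, string $fulltext): string
{
    // Escape the acronym so characters such as "." or "/" match literally.
    $word = preg_quote($acronym, '/');

    // Same SKIP/FAIL pattern as in the answer; the s flag lets "." cross newlines.
    $re = '/<pre[^<]*>.*?<\/pre>(*SKIP)(*F)|\b' . $word . '\b/s';

    // Escape the title so quotes in the description cannot break the attribute.
    $title = htmlspecialchars($fulltext, ENT_QUOTES);
    $replacement = '<acronym title="' . $title . '">' . $acronym . '</acronym>';

    // A callback returns the replacement verbatim (no "$1"-style interpretation).
    // On a PCRE error preg_replace_callback returns null; fall back to the input.
    return preg_replace_callback($re, fn(array $m) => $replacement, $text) ?? $text;
}

// Usage: only the occurrence outside <pre> is wrapped.
echo wrap_acronym_outside_pre(
    "<pre>ASCII</pre> ASCII",
    "ASCII",
    "American Standard Code for Information Interchange"
);
```

Note that \b only marks a boundary next to a word character, so an acronym that begins or ends with punctuation would need different boundary checks.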
