PHP: Regex to ignore escaped quotes within quotes

Question

I looked through related questions before posting this and I couldn't modify any relevant answers to work with my method (not good at regex).

Basically, here are my existing lines:

$code = preg_replace_callback( '/"(.*?)"/', array( &$this, '_getPHPString' ), $code );

$code = preg_replace_callback( "#'(.*?)'#", array( &$this, '_getPHPString' ), $code );

They both match strings contained between '' and "". I need the regex to ignore escaped quotes contained between themselves. So data between '' will ignore \' and data between "" will ignore \".

Any help would be greatly appreciated.

Recommended answer

For most strings, you need to allow escaped anything (not just escaped quotes). e.g. you most likely need to allow escaped characters like "\n" and "\t" and of course, the escaped-escape: "\\".
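To make the problem concrete, here is a minimal sketch comparing the lazy pattern from the question with a pattern that treats a backslash plus the following character as one unit. The sample `$code` string and the variable names `$naive` and `$aware` are made up for illustration.

```php
<?php
// Hypothetical input: a double-quoted literal that contains escaped quotes.
// In a single-quoted PHP string, \" stays as a literal backslash + quote.
$code = 'echo "say \"hi\" now";';

// The lazy pattern from the question stops at the first quote it meets,
// even when that quote is preceded by a backslash.
preg_match('/"(.*?)"/', $code, $naive);

// Consuming "backslash + any character" as a pair keeps escaped quotes
// (and \n, \t, \\ ...) inside the match.
preg_match('/"(?:[^"\\\\]|\\\\.)*"/', $code, $aware);

echo $naive[0], "\n"; // "say \"
echo $aware[0], "\n"; // "say \"hi\" now"
```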

This is a frequently asked question, and one which was solved (and optimized) long ago. Jeffrey Friedl covers this question in depth (as an example) in his classic work: Mastering Regular Expressions (3rd Edition). Here is the regex you are looking for, in three versions:

"([^"\\]|\\.)*"
Version 1: Works correctly, but is not terribly efficient.

"([^"\\]++|\\.)*" or "((?>[^"\\]+)|\\.)*"
Version 2: More efficient if you have possessive quantifiers or atomic groups (see sin's correct answer, which uses the atomic group method).

"[^"\\]*(?:\\.[^"\\]*)*"
Version 3: More efficient still. Implements Friedl's "unrolling-the-loop" technique. Does not require possessive quantifiers or atomic groups, so it can be used in JavaScript and other less-featured regex engines.
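As a quick sanity check, the sketch below runs all of the patterns above against the same input and collects what each one matches. The array keys and the sample `$subject` are hypothetical. It only illustrates that the versions agree on the matches; it does not measure their speed.

```php
<?php
// The patterns from the text, written as single-quoted PHP strings
// (each regex backslash is doubled once more for PHP).
$versions = [
    'alternation' => '/"([^"\\\\]|\\\\.)*"/s',
    'possessive'  => '/"([^"\\\\]++|\\\\.)*"/s',
    'atomic'      => '/"((?>[^"\\\\]+)|\\\\.)*"/s',
    'unrolled'    => '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s',
];

// Hypothetical sample, kept in a nowdoc so the backslashes are literal:
// it holds escaped quotes, an escaped backslash, and an escaped "n".
$subject = <<<'TXT'
x = "a \"b\" \\" + "c\n";
TXT;

$results = [];
foreach ($versions as $name => $re) {
    preg_match_all($re, $subject, $m);
    $results[$name] = $m[0]; // full matches only
}

foreach ($results as $name => $found) {
    echo $name, ': ', implode(' | ', $found), "\n";
}
```

Each version should report the same two literals: the first ends right after the escaped backslash, and the second is `"c\n"`.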

Here are the recommended regexes in PHP syntax for both double and single quoted sub-strings:

$re_dq = '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s';
$re_sq = "/'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'/s";
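A possible way to plug these two patterns into `preg_replace_callback`, as in the question, is sketched below. The question does not show what `_getPHPString` does, so the closure here is a hypothetical stand-in that stashes each literal and returns a numbered placeholder. The sample `$code` is also made up.

```php
<?php
$re_dq = '/"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"/s';
$re_sq = "/'[^'\\\\]*(?:\\\\.[^'\\\\]*)*'/s";

// Hypothetical stand-in for the question's _getPHPString() method:
// remember the matched literal and return a numbered placeholder.
$stash = [];
$callback = function (array $m) use (&$stash): string {
    $stash[] = $m[0];
    return '<STR' . (count($stash) - 1) . '>';
};

// Hypothetical input, kept in a nowdoc so the backslashes are literal.
$code = <<<'PHP'
echo "say \"hi\"" . 'it\'s';
PHP;

// Same two-pass structure as in the question: double quotes, then single.
$code = preg_replace_callback($re_dq, $callback, $code);
$code = preg_replace_callback($re_sq, $callback, $code);

echo $code, "\n"; // echo <STR0> . <STR1>;
```

One caveat with running two separate passes: a literal of one kind that contains the other quote character (for example an apostrophe inside a double-quoted string) can be misread, depending on which pass runs first. Combining both patterns into a single alternation is one way around that, though it goes beyond what the answer shows.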
