RegExp in preg_match function returning browser error
Problem description

The following function breaks with the regexp I've provided in the $pattern variable. If I change the regexp I'm fine, so I think that's the problem. I can't see what is wrong with it, though, and I'm not receiving a standard PHP error even though error reporting is turned on.

    function parseAPIResults($results){
        // Takes results from getAPIResults, returns array.
        $pattern = '/\[(.|\n)+\]/';
        $resultsArray = preg_match($pattern, $results, $matches);
    }

Firefox 6: The connection was reset
Chrome 14: Error 101 (net::ERR_CONNECTION_RESET): The connection was reset.
IE 8: Internet Explorer cannot display the webpage
UPDATE:
Apache/PHP may be crashing. Here's the Apache error log from when I run the script:

    [Sat Oct 01 11:41:40 2011] [notice] Parent: child process exited with status 255 -- Restarting.
    [Sat Oct 01 11:41:40 2011] [notice] Apache/2.2.11 (Win32) PHP/5.3.0 configured -- resuming normal operations

Running WAMP 2.0 on Windows 7.

Answer

Simple question. Complex answer!
Yes, this class of regex will repeatably (and silently) crash Apache/PHP with an unhandled segmentation fault due to a stack overflow!
Background:
The PHP preg_* family of regex functions uses the powerful PCRE library by Philip Hazel. With this library, there is a certain class of regex which requires lots of recursive calls to its internal match() function, and this uses up a lot of stack space (the stack space used is directly proportional to the size of the subject string being matched). Thus, if the subject string is too long, a stack overflow and corresponding segmentation fault will occur. This behavior is described at the end of the PCRE documentation, in the section titled pcrestack.

PHP Bug 1: PHP sets pcre.recursion_limit too large.
The PCRE documentation describes how to avoid a stack overflow segmentation fault by limiting the recursion depth to a safe value roughly equal to the stack size of the linked application divided by 500. When the recursion depth is properly limited as recommended, the library does not generate a stack overflow and instead gracefully exits with an error code. Under PHP, this maximum recursion depth is specified with the pcre.recursion_limit configuration variable and (unfortunately) the default value is set to 100,000. This value is TOO BIG! Here is a table of safe values of pcre.recursion_limit for a variety of executable stack sizes:

    Stacksize   pcre.recursion_limit
    64 MB       134217
    32 MB       67108
    16 MB       33554
    8 MB        16777
    4 MB        8388
    2 MB        4194
    1 MB        2097
    512 KB      1048
    256 KB      524

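The table values follow from the rule of thumb quoted above: stack size in bytes divided by 500, rounded down. As a minimal sketch, the calculation could look like the following. The function name safeRecursionLimit is hypothetical (it is not part of PHP or PCRE), and the code assumes PHP 7 or later for intdiv() and scalar type declarations, which is newer than the PHP 5.3.0 discussed in this thread.

```php
<?php
// Hypothetical helper (not a PHP built-in): derive a conservative
// pcre.recursion_limit from an executable's stack size, using the
// "stack size divided by 500" rule of thumb from the PCRE documentation.
function safeRecursionLimit(int $stackBytes): int
{
    // Integer division rounds down, which errs on the safe side.
    return intdiv($stackBytes, 500);
}

$kb = 1024;
$mb = 1024 * 1024;

echo safeRecursionLimit(256 * $kb), "\n"; // 256 KB stack -> 524
echo safeRecursionLimit(8 * $mb), "\n";   // 8 MB stack   -> 16777
```

The two printed values match the 256 KB and 8 MB rows of the table.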
Thus, for the Win32 build of the Apache webserver (httpd.exe), which has a (relatively small) stack size of 256KB, pcre.recursion_limit should be set to 524. This can be accomplished with the following line of PHP code:

    ini_set("pcre.recursion_limit", "524"); // PHP default is 100,000.

When this code is added to the PHP script, the stack overflow does NOT occur; PCRE stops at the recursion limit and reports a meaningful error code instead. That is, it SHOULD generate an error code! (But unfortunately, due to another PHP bug, preg_match() does not.)

PHP Bug 2: preg_match() does not return FALSE on error.
The PHP documentation for preg_match() says that it returns FALSE on error. Unfortunately, PHP versions 5.3.3 and below have a bug (#52732) where preg_match() does NOT return FALSE on error (it instead returns int(0), which is the same value returned in the case of a non-match). This bug was fixed in PHP version 5.3.4.

Solution:
Assuming you will continue using WAMP 2.0 (with PHP 5.3.0) the solution needs to take both of the above bugs into consideration. Here is what I would recommend:
- Reduce pcre.recursion_limit to a safe value: 524.
- Explicitly check for a PCRE error whenever preg_match() returns anything other than int(1).
- If preg_match() returns int(1), then the match was successful.
- If preg_match() returns int(0), then either the match was not successful or there was an error.

Here is a modified version of your script (designed to be run from the command line) that determines the subject string length that results in the recursion limit error:

    <?php
    // This test script is designed to be run from the command line.
    // It measures the subject string length that results in a
    // PREG_RECURSION_LIMIT_ERROR error in the preg_match() function.

    echo("Entering TEST.PHP...\n");

    // Set and display pcre.recursion_limit. (set to stacksize / 500).
    // Under Win32 httpd.exe has a stack = 256KB and 8MB for php.exe.
    //ini_set("pcre.recursion_limit", "524");   // Stacksize = 256KB.
    ini_set("pcre.recursion_limit", "16777");   // Stacksize = 8MB.
    echo(sprintf("PCRE pcre.recursion_limit is set to %s\n",
        ini_get("pcre.recursion_limit")));

    function parseAPIResults($results){
        $pattern = "/\[(.|\n)+\]/";
        $resultsArray = preg_match($pattern, $results, $matches);
        if ($resultsArray === 1) {
            $msg = 'Successful match.';
        } else {
            // Either an unsuccessful match, or a PCRE error occurred.
            $pcre_err = preg_last_error();  // PHP 5.2 and above.
            if ($pcre_err === PREG_NO_ERROR) {
                $msg = 'Successful non-match.';
            } else {
                // preg_match error!
                switch ($pcre_err) {
                    case PREG_INTERNAL_ERROR:
                        $msg = 'PREG_INTERNAL_ERROR';
                        break;
                    case PREG_BACKTRACK_LIMIT_ERROR:
                        $msg = 'PREG_BACKTRACK_LIMIT_ERROR';
                        break;
                    case PREG_RECURSION_LIMIT_ERROR:
                        $msg = 'PREG_RECURSION_LIMIT_ERROR';
                        break;
                    case PREG_BAD_UTF8_ERROR:
                        $msg = 'PREG_BAD_UTF8_ERROR';
                        break;
                    case PREG_BAD_UTF8_OFFSET_ERROR:
                        $msg = 'PREG_BAD_UTF8_OFFSET_ERROR';
                        break;
                    default:
                        $msg = 'Unrecognized PREG error';
                        break;
                }
            }
        }
        return($msg);
    }

    // Build a matching test string of increasing size.
    function buildTestString() {
        static $content = "";
        $content .= "A";
        return '['. $content .']';
    }

    // Find subject string length that results in error.
    for (;;) { // Infinite loop. Break out.
        $str = buildTestString();
        $msg = parseAPIResults($str);
        printf("Length =%10d\r", strlen($str));
        if ($msg !== 'Successful match.') break;
    }

    echo(sprintf("\nPCRE_ERROR = \"%s\" at subject string length = %d\n",
        $msg, strlen($str)));

    echo("Exiting TEST.PHP...");
    ?>

When you run this script, it provides a continuous readout of the current length of the subject string. If pcre.recursion_limit is left at its too-high default value, this allows you to measure the length of string that causes the executable to crash.

Comments:
- Before investigating the answer to this question, I didn't know about the PHP bug where preg_match() fails to return FALSE when an error occurs in the PCRE library. This bug certainly calls into question a LOT of code that uses preg_match()! (I'm certainly going to do an inventory of my own PHP code.)
- Under Windows, the Apache webserver executable (httpd.exe) is built with a stacksize of 256KB. The PHP command line executable (php.exe) is built with a stacksize of 8MB. The safe value for pcre.recursion_limit should be set in accordance with the executable that the script is being run under (524 and 16777 respectively).
- Under *nix systems, the Apache webserver and command line executables are both typically built with a stacksize of 8MB, so this problem is not encountered as often.
- The PHP developers should set the default value of pcre.recursion_limit to a safe value.
- The PHP developers should apply the preg_match() bugfix to PHP version 5.2.
- The stacksize of a Windows executable can be manually modified using the CFF Explorer freeware program. You can use this program to increase the stacksize of the Apache httpd.exe executable. (This works under XP but Vista and Win7 might complain.)
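
For anyone taking inventory of existing preg_match() calls, one possible approach is to route them through a small wrapper that consults preg_last_error() every time, so a PCRE error cannot be mistaken for a non-match even on PHP versions affected by bug #52732. The sketch below is only an illustration: the name checkedPregMatch is hypothetical, and the nullable by-reference parameter assumes PHP 7.1 or later rather than the PHP 5.3.0 used in the question.

```php
<?php
// Hypothetical wrapper (checkedPregMatch is not a PHP built-in).
// Returns 1 on a match, 0 on a genuine non-match, and false whenever
// PCRE reports an error, regardless of what preg_match() itself returned.
function checkedPregMatch(string $pattern, string $subject, ?array &$matches = null)
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === 1) {
        return 1;
    }
    // On affected PHP versions preg_match() may return 0 on error,
    // so the PCRE error state is checked explicitly as well.
    if ($result === false || preg_last_error() !== PREG_NO_ERROR) {
        return false;
    }
    return 0;
}

var_dump(checkedPregMatch('/\[(.|\n)+\]/', '[abc]', $m));        // int(1)
var_dump(checkedPregMatch('/\[(.|\n)+\]/', 'no brackets here')); // int(0)
```

Callers can then use a strict comparison against false to tell a PCRE error apart from a subject that simply does not match.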