RegExp in preg_match function returning browser error

Problem description

The following function breaks with the regexp I've provided in the $pattern variable. If I change the regexp I'm fine, so I think that's the problem. I can't see what is wrong with it, though, and I'm not receiving a standard PHP error even though error reporting is turned on.

function parseAPIResults($results){
//Takes results from getAPIResults, returns array.

    $pattern = '/\[(.|\n)+\]/';
    $resultsArray = preg_match($pattern, $results, $matches);

}

Firefox 6: The connection was reset

Chrome 14: Error 101 (net::ERR_CONNECTION_RESET): The connection was reset.

IE 8: Internet Explorer cannot display the webpage

UPDATE:
Apache/PHP may be crashing. Here's the Apache error log from when I run the script:

[Sat Oct 01 11:41:40 2011] [notice] Parent: child process exited with status 255 -- Restarting.
[Sat Oct 01 11:41:40 2011] [notice] Apache/2.2.11 (Win32) PHP/5.3.0 configured -- resuming normal operations

Running WAMP 2.0 on Windows 7.

Solution

Simple question. Complex answer!

Yes, this class of regex will repeatably (and silently) crash Apache/PHP with an unhandled segmentation fault due to a stack overflow!

Background:

The PHP preg_* family of regex functions uses the powerful PCRE library by Philip Hazel. With this library, there is a certain class of regex which requires lots of recursive calls to its internal match() function, and this uses up a lot of stack space (the stack space used is directly proportional to the size of the subject string being matched). Thus, if the subject string is too long, a stack overflow and corresponding segmentation fault will occur. This behavior is described at the end of the PCRE documentation, in the section titled pcrestack.
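As an illustration of that class of regex: in the question's pattern, the repeated group (.|\n)+ is the part that is applied once per character of the subject. The sketch below shows one possible rewrite that drops the group and relies on the s (dot-all) modifier instead. It assumes that only the overall match is needed, since capture group 1 disappears, and the claim that it needs less recursion is my reading of the pcrestack discussion rather than something measured here.

```php
<?php
// Pattern from the question, for comparison: '/\[(.|\n)+\]/'
// The repeated alternation group is the recursion-heavy part.

// Sketch of an alternative: with the "s" modifier, "." also matches newlines,
// so the alternation group is no longer needed.
// Side effect: there is no capture group 1 any more.
$flat = '/\[.+\]/s';

$subject = "[line one\nline two]";
$matched = preg_match($flat, $subject, $m);
```

Whether this avoids the crash on a given build is worth confirming with the test script further down, by swapping in the new pattern.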

PHP Bug 1: PHP sets pcre.recursion_limit too large.

The PCRE documentation describes how to avoid a stack overflow segmentation fault by limiting the recursion depth to a safe value roughly equal to the stack size of the linked application divided by 500. When the recursion depth is properly limited as recommended, the library does not generate a stack overflow and instead gracefully exits with an error code. Under PHP, this maximum recursion depth is specified with the pcre.recursion_limit configuration variable and (unfortunately) the default value is set to 100,000. This value is TOO BIG! Here is a table of safe values of pcre.recursion_limit for a variety of executable stack sizes:

Stacksize   pcre.recursion_limit
 64 MB      134217
 32 MB      67108
 16 MB      33554
  8 MB      16777
  4 MB      8388
  2 MB      4194
  1 MB      2097
512 KB      1048
256 KB      524
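The table values are simply the stack size in bytes divided by 500, rounded down. A minimal sketch of that arithmetic follows; safeRecursionLimit() is a hypothetical helper name, not a PHP built-in.

```php
<?php
// Sketch: derive a conservative pcre.recursion_limit from a stack size,
// using the stacksize / 500 rule of thumb quoted above.
// safeRecursionLimit() is a hypothetical helper, not part of PHP.
function safeRecursionLimit($stackBytes)
{
    return (int) floor($stackBytes / 500);
}

// Example: a 256 KB stack, as quoted for the Win32 httpd.exe build.
$limit = safeRecursionLimit(256 * 1024);
ini_set('pcre.recursion_limit', (string) $limit);
```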

Thus, for the Win32 build of the Apache webserver (httpd.exe), which has a relatively small stack size of 256KB, pcre.recursion_limit should be set to 524. This can be accomplished with the following line of PHP code:

ini_set("pcre.recursion_limit", "524"); // PHP default is 100,000.

When this code is added to the PHP script, the stack overflow does NOT occur; instead, the PCRE library stops and reports a meaningful error code. That is, preg_match() SHOULD then signal the error! (But unfortunately, due to another PHP bug, it does not.)

PHP Bug 2: preg_match() does not return FALSE on error.

The PHP documentation for preg_match() says that it returns FALSE on error. Unfortunately, PHP versions 5.3.3 and below have a bug (#52732) where preg_match() does NOT return FALSE on error (it instead returns int(0), which is the same value returned in the case of a non-match). This bug was fixed in PHP version 5.3.4.

Solution:

Assuming you will continue using WAMP 2.0 (with PHP 5.3.0) the solution needs to take both of the above bugs into consideration. Here is what I would recommend:

  • Reduce pcre.recursion_limit to a safe value: 524.
  • Explicitly check for a PCRE error whenever preg_match() returns anything other than int(1).
  • If preg_match() returns int(1), then the match was successful.
  • If preg_match() returns int(0), then either there was no match or an error occurred; preg_last_error() tells the two apart.
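The checks in this list can be folded into a small wrapper. The following is only a sketch: checkedPregMatch() is a hypothetical name, and throwing a RuntimeException is one possible way to surface the error, not something the answer prescribes.

```php
<?php
// Hypothetical wrapper (not part of PHP): returns true on a match, false on a
// clean non-match, and throws if PCRE reported an error.
function checkedPregMatch($pattern, $subject, &$matches = null)
{
    $result = preg_match($pattern, $subject, $matches);
    if ($result === 1) {
        return true;
    }
    // int(0) or FALSE: ask preg_last_error() whether this was a real
    // non-match or a failure (needed on PHP builds affected by bug #52732).
    $err = preg_last_error();
    if ($result === false || $err !== PREG_NO_ERROR) {
        throw new RuntimeException("preg_match failed with PCRE error code $err");
    }
    return false;
}
```

A caller would then wrap the call in try/catch and treat the exception as "input too large or pattern failed" rather than as "no match".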

Here is a modified version of your script (designed to be run from the command line) that determines the subject string length that results in the recursion limit error:

<?php
// This test script is designed to be run from the command line.
// It measures the subject string length that results in a
// PREG_RECURSION_LIMIT_ERROR error in the preg_match() function.

echo("Entering TEST.PHP...\n");

// Set and display pcre.recursion_limit. (set to stacksize / 500).
// Under Win32, httpd.exe has a 256KB stack and php.exe has an 8MB stack.
//ini_set("pcre.recursion_limit", "524");       // Stacksize = 256KB.
ini_set("pcre.recursion_limit", "16777");   // Stacksize = 8MB.
echo(sprintf("PCRE pcre.recursion_limit is set to %s\n",
    ini_get("pcre.recursion_limit")));

function parseAPIResults($results){
    $pattern = "/\[(.|\n)+\]/";
    $resultsArray = preg_match($pattern, $results, $matches);
    if ($resultsArray === 1) {
        $msg = 'Successful match.';
    } else {
        // Either an unsuccessful match, or a PCRE error occurred.
        $pcre_err = preg_last_error();  // PHP 5.2 and above.
        if ($pcre_err === PREG_NO_ERROR) {
            $msg = 'Successful non-match.';
        } else {
            // preg_match error!
            switch ($pcre_err) {
                case PREG_INTERNAL_ERROR:
                    $msg = 'PREG_INTERNAL_ERROR';
                    break;
                case PREG_BACKTRACK_LIMIT_ERROR:
                    $msg = 'PREG_BACKTRACK_LIMIT_ERROR';
                    break;
                case PREG_RECURSION_LIMIT_ERROR:
                    $msg = 'PREG_RECURSION_LIMIT_ERROR';
                    break;
                case PREG_BAD_UTF8_ERROR:
                    $msg = 'PREG_BAD_UTF8_ERROR';
                    break;
                case PREG_BAD_UTF8_OFFSET_ERROR:
                    $msg = 'PREG_BAD_UTF8_OFFSET_ERROR';
                    break;
                default:
                    $msg = 'Unrecognized PREG error';
                    break;
            }
        }
    }
    return($msg);
}

// Build a matching test string of increasing size.
function buildTestString() {
    static $content = "";
    $content .= "A";
    return '['. $content .']';
}

// Find subject string length that results in error.
for (;;) { // Infinite loop. Break out.
    $str = buildTestString();
    $msg = parseAPIResults($str);
    printf("Length =%10d\r", strlen($str));
    if ($msg !== 'Successful match.') break;
}

echo(sprintf("\nPCRE_ERROR = \"%s\" at subject string length = %d\n",
    $msg, strlen($str)));

echo("Exiting TEST.PHP...");

?>

When you run this script, it provides a continuous readout of the current length of the subject string. If pcre.recursion_limit is left at its too-high default value, this lets you measure the string length at which the executable crashes.

Comments:

  • Before investigating the answer to this question, I didn't know about the PHP bug where preg_match() fails to return FALSE when an error occurs in the PCRE library. This bug certainly calls into question a LOT of code that uses preg_match! (I'm certainly going to do an inventory of my own PHP code.)
  • Under Windows, the Apache webserver executable (httpd.exe) is built with a stacksize of 256KB. The PHP command line executable (php.exe) is built with a stacksize of 8MB. The safe value for pcre.recursion_limit should be set in accordance with the executable that the script is being run under (524 and 16777 respectively).
  • Under *nix systems, the Apache webserver and command line executables are both typically built with a stacksize of 8MB, so this problem is not encountered as often.
  • The PHP developers should set the default value of pcre.recursion_limit to a safe value.
  • The PHP developers should apply the preg_match() bugfix to PHP version 5.2.
  • The stacksize of a Windows executable can be manually modified using the CFF Explorer freeware program. You can use this program to increase the stacksize of the Apache httpd.exe executable. (This works under XP but Vista and Win7 might complain.)
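Since the safe value depends on which executable runs the script, one option is to choose it at runtime. The sketch below hard-codes the two stack sizes quoted in these comments and does not detect the real stack size; recursionLimitFor() is a hypothetical helper, and treating every non-CLI SAPI on Windows as "Apache with a 256KB stack" is an assumption that may not hold for other setups.

```php
<?php
// Hypothetical helper: pick a pcre.recursion_limit from the stack sizes quoted
// above (256KB for Win32 httpd.exe, 8MB for php.exe and typical *nix builds).
// These sizes come from the answer; nothing here measures the actual stack.
function recursionLimitFor($isWindows, $sapiName)
{
    if ($isWindows && $sapiName !== 'cli') {
        return 524;     // 256KB / 500
    }
    return 16777;       // 8MB / 500
}

$isWindows = strtoupper(substr(PHP_OS, 0, 3)) === 'WIN';
ini_set('pcre.recursion_limit',
    (string) recursionLimitFor($isWindows, php_sapi_name()));
```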
