普通防爆pression提取PHP code部分((数组定义)) [英] Regular Expression to extract php code partially (( array definition ))

查看:113
本文介绍了普通防爆pression提取PHP code部分((数组定义))的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的PHP code存储((数组定义中))在这样的字符串

I have php code stored (( array definition )) in a string like this

$code=' array(

  0  => "a",
 "a" => $GlobalScopeVar,
 "b" => array("nested"=>array(1,2,3)),  
 "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },

); ';

有规则的前pression提取此阵?? ,我的意思是我想要的东西,像

there is a regular expression to extract this array??, i mean i want something like

$array=(  

  0  => '"a"',
 'a' => '$GlobalScopeVar',
 'b' => 'array("nested"=>array(1,2,3))',
 'c' => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',

);


物pD ::我做研究试图找到一个正规的前pression但没有被发现。结果
PD2 ::计算器的神,让我现在赏金这一点,我将提供400:3结果
::的pD3这将在内部应用程序,在这里我需要提取一些PHP文件的数组被'处理'的部分要使用的,我试着用这个解释的$c$cpad.org/td6LVVme

推荐答案

甚至当你问一个正则表达式,它的工作原理也与纯PHP。 token_get_all 是这里的关键功能。对于一个正则表达式检查 @哈姆扎的回答的了。

Even when you asked for a regex, it works also with pure PHP. token_get_all is here the key function. For a regex check @HamZa's answer out.

这里的优点是,它比一个正则表达式更有活力。一个正则表达式有一个静态的模式,同时与token_get_all,你可以每一个令牌后做什么决定。它甚至逃逸单引号和反斜杠在必要时,什么是正则表达式不会做。

The advantage here is that it is more dynamic than a regex. A regex has a static pattern, while with token_get_all, you can decide after every single token what to do. It even escapes single quotes and backslashes where necessary, what a regex wouldn't do.

此外,在正则表达式,你有评论甚至当,问题来想象它应该做的;什么code确实是很容易,当你在PHP code理解。

Also, in regex, you have, even when commented, problems to imagine what it should do; what code does is much easier to understand when you look at PHP code.

$code = ' array(

  0  => "a",
  "a" => $GlobalScopeVar,
  "b" => array("nested"=>array(1,2,3)),  
  "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
  "string_literal",
  12345

); ';

$token = token_get_all("<?php ".$code);
$newcode = "";

$i = 0;
while (++$i < count($token)) { // enter into array; then start.
        if (is_array($token[$i]))
                $newcode .= $token[$i][1];
        else
                $newcode .= $token[$i];

        if ($token[$i] == "(") {
                $ending = ")";
                break;
        }
        if ($token[$i] == "[") {
                $ending = "]";
                break;
        }
}

// init variables
$escape = 0;
$wait_for_non_whitespace = 0;
$parenthesis_count = 0;
$entry = "";

// main loop
while (++$i < count($token)) {
        // don't match commas in func($a, $b)
        if ($token[$i] == "(" || $token[$i] == "{") // ( -> normal parenthesis; { -> closures
                $parenthesis_count++;
        if ($token[$i] == ")" || $token[$i] == "}")
                $parenthesis_count--;

        // begin new string after T_DOUBLE_ARROW
        if (!$escape && $wait_for_non_whitespace && (!is_array($token[$i]) || $token[$i][0] != T_WHITESPACE)) {
                $escape = 1;
                $wait_for_non_whitespace = 0;
                $entry .= "'";
        }

        // here is a T_DOUBLE_ARROW, there will be a string after this
        if (is_array($token[$i]) && $token[$i][0] == T_DOUBLE_ARROW && !$escape) {
                $wait_for_non_whitespace = 1;
        }

        // entry ended: comma reached
        if (!$parenthesis_count && $token[$i] == "," || ($parenthesis_count == -1 && $token[$i] == ")" && $ending == ")") || ($ending == "]" && $token[$i] == "]")) {
                // go back to the first non-whitespace
                $whitespaces = "";
                if ($parenthesis_count == -1 || ($ending == "]" && $token[$i] == "]")) {
                        $cut_at = strlen($entry);
                        while ($cut_at && ord($entry[--$cut_at]) <= 0x20); // 0x20 == " "
                        $whitespaces = substr($entry, $cut_at + 1, strlen($entry));
                        $entry = substr($entry, 0, $cut_at + 1);
                }

                // $escape == true means: there was somewhere a T_DOUBLE_ARROW
                if ($escape) {
                        $escape = 0;
                        $newcode .= $entry."'";
                } else {
                        $newcode .= "'".addcslashes($entry, "'\\")."'";
                }

                $newcode .= $whitespaces.($parenthesis_count?")":(($ending == "]" && $token[$i] == "]")?"]":","));

                // reset
                $entry = "";
        } else {
                // add actual token to $entry
                if (is_array($token[$i])) {
                        $addChar = $token[$i][1];
                } else {
                        $addChar = $token[$i];
                }

                if ($entry == "" && $token[$i][0] == T_WHITESPACE) {
                        $newcode .= $addChar;
                } else {
                        $entry .= $escape?str_replace(array("'", "\\"), array("\\'", "\\\\"), $addChar):$addChar;
                }
        }
}

//append remaining chars like whitespaces or ;
$newcode .= $entry;

print $newcode;

<大骨节病>演示在: http://3v4l.org/qe4Q1

应该输出:

array(

  0  => '"a"',
  "a" => '$GlobalScopeVar',
  "b" => 'array("nested"=>array(1,2,3))',  
  "c" => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
  '"string_literal"',
  '12345'

) 


可以,拿到阵列的数据,的print_r(的eval(回归$新的code;)); 来获取数组的条目:

Array
(
    [0] => "a"
    [a] => $GlobalScopeVar
    [b] => array("nested"=>array(1,2,3))
    [c] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
    [1] => "string_literal"
    [2] => 12345
)

这篇关于普通防爆pression提取PHP code部分((数组定义))的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆