Regular expression to partially extract PHP code (array definition)


Problem description

I have PHP code (an array definition) stored in a string like this:

$code=' array(

  0  => "a",
 "a" => $GlobalScopeVar,
 "b" => array("nested"=>array(1,2,3)),  
 "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },

); ';

Is there a regular expression to extract this array? I mean I want something like:

$array=array(  

  0  => '"a"',
 'a' => '$GlobalScopeVar',
 'b' => 'array("nested"=>array(1,2,3))',
 'c' => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',

);


PS: I did some research trying to find a regular expression, but found nothing.
PS2: Gods of Stack Overflow, let me put a bounty on this now and I will offer 400 :3
PS3: This will be used in an internal app, where I need to extract an array from some PHP file so it can be 'processed' in parts; I try to explain it with this: codepad.org/td6LVVme

Solution

Even though you asked for a regex, this can also be done in pure PHP. token_get_all is the key function here. For a regex, check out @HamZa's answer.
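To see what token_get_all hands back, here is a minimal sketch on a much smaller input. The $snippet string is a made-up example, not the array from the question. Named tokens arrive as arrays of (token id, source text, line number), while single characters such as "(" or "," arrive as plain strings, which is why the code further down keeps checking is_array.

```php
<?php
// Hypothetical input, much smaller than the array in the question.
$snippet = 'array("a" => $x, 1)';

// token_get_all() only tokenizes PHP after an opening tag, so prepend one.
$tokens = token_get_all("<?php " . $snippet);

$lines = array();
foreach ($tokens as $token) {
    if (is_array($token)) {
        // Named token: array(token id, source text, line number).
        $lines[] = token_name($token[0]) . " " . var_export($token[1], true);
    } else {
        // Single-character token such as "(", ")" or ",".
        $lines[] = "CHAR " . var_export($token, true);
    }
}

print implode("\n", $lines) . "\n";
```

The listing it prints contains entries like T_ARRAY, T_DOUBLE_ARROW and T_WHITESPACE, the same constants the loop below compares against.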

The advantage is that it is more dynamic than a regex. A regex has a static pattern, while with token_get_all you can decide what to do after every single token. It even escapes single quotes and backslashes where necessary, which a regex wouldn't do.
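The escaping step can be looked at in isolation. The following is only a sketch: the $entry text is an invented value that already contains a single quote and a backslash, and it shows how addcslashes makes such text safe to wrap in single quotes so that evaluating the literal gives the original text back.

```php
<?php
// Hypothetical entry text; the nowdoc keeps the quote and backslash literal.
$entry = <<<'SRC'
function() { return 'it\'s'; }
SRC;

// Escape every single quote and backslash, then wrap in single quotes.
$literal = "'" . addcslashes($entry, "'\\") . "'";

// Evaluating the literal should reproduce the original text unchanged.
$roundTrip = eval("return $literal;");

print $literal . "\n";
var_dump($roundTrip === $entry);
```

Without the escaping, the first quote inside the entry would end the string literal early and the generated code would no longer parse.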

Also, a regex, even when commented, makes it hard to picture what it is supposed to do; plain PHP code is much easier to understand when you read it.

$code = ' array(

  0  => "a",
  "a" => $GlobalScopeVar,
  "b" => array("nested"=>array(1,2,3)),  
  "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
  "string_literal",
  12345

); ';

$token = token_get_all("<?php ".$code);
$newcode = "";

$i = 0;
while (++$i < count($token)) { // enter into array; then start.
        if (is_array($token[$i]))
                $newcode .= $token[$i][1];
        else
                $newcode .= $token[$i];

        if ($token[$i] == "(") {
                $ending = ")";
                break;
        }
        if ($token[$i] == "[") {
                $ending = "]";
                break;
        }
}

// init variables
$escape = 0;
$wait_for_non_whitespace = 0;
$parenthesis_count = 0;
$entry = "";

// main loop
while (++$i < count($token)) {
        // don't match commas in func($a, $b)
        if ($token[$i] == "(" || $token[$i] == "{") // ( -> normal parenthesis; { -> closures
                $parenthesis_count++;
        if ($token[$i] == ")" || $token[$i] == "}")
                $parenthesis_count--;

        // begin new string after T_DOUBLE_ARROW
        if (!$escape && $wait_for_non_whitespace && (!is_array($token[$i]) || $token[$i][0] != T_WHITESPACE)) {
                $escape = 1;
                $wait_for_non_whitespace = 0;
                $entry .= "'";
        }

        // here is a T_DOUBLE_ARROW, there will be a string after this
        if (is_array($token[$i]) && $token[$i][0] == T_DOUBLE_ARROW && !$escape) {
                $wait_for_non_whitespace = 1;
        }

        // entry ended: comma reached
        if (!$parenthesis_count && $token[$i] == "," || ($parenthesis_count == -1 && $token[$i] == ")" && $ending == ")") || ($ending == "]" && $token[$i] == "]")) {
                // go back to the first non-whitespace
                $whitespaces = "";
                if ($parenthesis_count == -1 || ($ending == "]" && $token[$i] == "]")) {
                        $cut_at = strlen($entry);
                        while ($cut_at && ord($entry[--$cut_at]) <= 0x20); // 0x20 == " "
                        $whitespaces = substr($entry, $cut_at + 1, strlen($entry));
                        $entry = substr($entry, 0, $cut_at + 1);
                }

                // $escape == true means: there was somewhere a T_DOUBLE_ARROW
                if ($escape) {
                        $escape = 0;
                        $newcode .= $entry."'";
                } else {
                        $newcode .= "'".addcslashes($entry, "'\\")."'";
                }

                $newcode .= $whitespaces.($parenthesis_count?")":(($ending == "]" && $token[$i] == "]")?"]":","));

                // reset
                $entry = "";
        } else {
                // add actual token to $entry
                if (is_array($token[$i])) {
                        $addChar = $token[$i][1];
                } else {
                        $addChar = $token[$i];
                }

                if ($entry == "" && is_array($token[$i]) && $token[$i][0] == T_WHITESPACE) {
                        $newcode .= $addChar;
                } else {
                        $entry .= $escape?str_replace(array("'", "\\"), array("\\'", "\\\\"), $addChar):$addChar;
                }
        }
}

//append remaining chars like whitespaces or ;
$newcode .= $entry;

print $newcode;

Demo at: http://3v4l.org/qe4Q1

Should output:

array(

  0  => '"a"',
  "a" => '$GlobalScopeVar',
  "b" => 'array("nested"=>array(1,2,3))',  
  "c" => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
  '"string_literal"',
  '12345'

) 


To get the array's data, you can call print_r(eval("return $newcode;")); which prints the entries of the array:

Array
(
    [0] => "a"
    [a] => $GlobalScopeVar
    [b] => array("nested"=>array(1,2,3))
    [c] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
    [1] => "string_literal"
    [2] => 12345
)
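Once the values are plain strings, each entry can be handled on its own. The sketch below assumes a hand-written $newcode with only two entries, standing in for the string the script above builds; the loop body is just an illustration of per-entry processing.

```php
<?php
// Hand-written stand-in for the generated $newcode (an assumption for this
// sketch; only two of the entries are kept).
$newcode = <<<'CODE'
array(
  "a" => '$GlobalScopeVar',
  "b" => 'array("nested"=>array(1,2,3))'
)
CODE;

// Every value is a single-quoted string, so this yields an array of strings.
$entries = eval("return $newcode;");

foreach ($entries as $key => $source) {
    // Each $source is PHP source text and can be inspected separately,
    // for example by tokenizing it again.
    $count = count(token_get_all("<?php " . $source));
    print $key . ": " . $source . " (" . $count . " tokens)\n";
}
```

An entry such as "b", which is itself an array definition, could presumably be fed through the same token loop again to split it one level deeper.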
