Regular expression to partially extract PHP code (an array definition)
Problem description
I have PHP code (an array definition) stored in a string like this:
$code=' array(
    0 => "a",
    "a" => $GlobalScopeVar,
    "b" => array("nested"=>array(1,2,3)),
    "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
); ';
Is there a regular expression to extract this array? I mean, I want something like:
$array = array(
    0 => '"a"',
    'a' => '$GlobalScopeVar',
    'b' => 'array("nested"=>array(1,2,3))',
    'c' => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
);
PS: I did some research trying to find a regular expression, but found nothing.
PS2: Gods of Stack Overflow, let me put a bounty on this now and I will offer 400 :3
PS3: This will be used in an internal app, where I need to extract an array from some PHP files so it can be 'processed' in parts. I try to explain it with this: codepad.org/td6LVVme
Even though you asked for a regex, this also works with pure PHP. token_get_all is the key function here. For a regex, check out @HamZa's answer.
The advantage here is that it is more dynamic than a regex. A regex has a static pattern, while with token_get_all you can decide what to do after every single token. It even escapes single quotes and backslashes where necessary, which a regex wouldn't do.
Also, with a regex, even a commented one, it is hard to picture what it is supposed to do; the same logic written as PHP code is much easier to follow.
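As a rough illustration of what token_get_all hands back, here is a small sketch that lists the token stream for a shortened input. The $snippet string and the $names variable are made up for this example; token_get_all and token_name are the only real PHP functions involved.

```php
<?php
// Hypothetical input, much shorter than the array in the question.
$snippet = 'array("a" => $x, 1)';

$names = [];
foreach (token_get_all("<?php " . $snippet) as $tok) {
    if (is_array($tok)) {
        // Array tokens look like [id, text, line]; token_name() maps the id to a label.
        $names[] = token_name($tok[0]);
    } else {
        // Single-character tokens such as "(" or "," come back as plain strings.
        $names[] = $tok;
    }
}

// Print one label per token so the structure of the input is visible.
echo implode(" ", $names), "\n";
```

The mix of array tokens and plain one-character strings is why the script keeps checking is_array() before reading a token's id or text.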
$code = ' array(
    0 => "a",
    "a" => $GlobalScopeVar,
    "b" => array("nested"=>array(1,2,3)),
    "c" => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; },
    "string_literal",
    12345
); ';
$token = token_get_all("<?php ".$code);
$newcode = "";
$i = 0;
// copy everything up to the array's opening delimiter, then start
// ($i starts at 0 and is pre-incremented, so the T_OPEN_TAG token is skipped)
while (++$i < count($token)) {
    if (is_array($token[$i]))
        $newcode .= $token[$i][1];
    else
        $newcode .= $token[$i];
    if ($token[$i] == "(") {
        $ending = ")";
        break;
    }
    if ($token[$i] == "[") {
        $ending = "]";
        break;
    }
}
// init variables
$escape = 0;
$wait_for_non_whitespace = 0;
$parenthesis_count = 0;
$entry = "";
// main loop
while (++$i < count($token)) {
    // don't match commas in func($a, $b)
    if ($token[$i] == "(" || $token[$i] == "{") // ( -> normal parenthesis; { -> closures
        $parenthesis_count++;
    if ($token[$i] == ")" || $token[$i] == "}")
        $parenthesis_count--;
    // begin new string after T_DOUBLE_ARROW
    if (!$escape && $wait_for_non_whitespace && (!is_array($token[$i]) || $token[$i][0] != T_WHITESPACE)) {
        $escape = 1;
        $wait_for_non_whitespace = 0;
        $entry .= "'";
    }
    // here is a T_DOUBLE_ARROW, there will be a string after this
    if (is_array($token[$i]) && $token[$i][0] == T_DOUBLE_ARROW && !$escape) {
        $wait_for_non_whitespace = 1;
    }
    // entry ended: top-level comma or the array's closing delimiter reached
    if ((!$parenthesis_count && $token[$i] == ",") || ($parenthesis_count == -1 && $token[$i] == ")" && $ending == ")") || ($ending == "]" && $token[$i] == "]")) {
        // go back to the last non-whitespace character of the entry
        $whitespaces = "";
        if ($parenthesis_count == -1 || ($ending == "]" && $token[$i] == "]")) {
            $cut_at = strlen($entry);
            while ($cut_at && ord($entry[--$cut_at]) <= 0x20); // 0x20 == " "
            $whitespaces = substr($entry, $cut_at + 1, strlen($entry));
            $entry = substr($entry, 0, $cut_at + 1);
        }
        // $escape == true means: there was a T_DOUBLE_ARROW somewhere in this entry
        if ($escape) {
            $escape = 0;
            $newcode .= $entry."'";
        } else {
            $newcode .= "'".addcslashes($entry, "'\\")."'";
        }
        $newcode .= $whitespaces.($parenthesis_count?")":(($ending == "]" && $token[$i] == "]")?"]":","));
        // reset
        $entry = "";
    } else {
        // add actual token to $entry
        if (is_array($token[$i])) {
            $addChar = $token[$i][1];
        } else {
            $addChar = $token[$i];
        }
        // leading whitespace goes straight to the output, not into the entry
        if ($entry == "" && is_array($token[$i]) && $token[$i][0] == T_WHITESPACE) {
            $newcode .= $addChar;
        } else {
            $entry .= $escape?str_replace(array("'", "\\"), array("\\'", "\\\\"), $addChar):$addChar;
        }
    }
}
// append remaining chars like whitespaces or ;
$newcode .= $entry;
print $newcode;
Demo at: http://3v4l.org/qe4Q1
Should output:
array(
    0 => '"a"',
    "a" => '$GlobalScopeVar',
    "b" => 'array("nested"=>array(1,2,3))',
    "c" => 'function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }',
    '"string_literal"',
    '12345'
)
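The part of the script that keeps commas inside nested calls and closures from ending an entry can be looked at in isolation. Below is a minimal sketch of that depth-counting idea, assuming a comma-separated list with no surrounding array( ... ). The function name splitTopLevel is hypothetical and not part of the script above, and the sketch does not try to handle braces inside interpolated strings.

```php
<?php
// splitTopLevel() is a hypothetical helper written for this illustration.
// It splits a PHP expression list on commas that are not nested in (), [] or {}.
function splitTopLevel(string $list): array
{
    $parts = [];
    $current = "";
    $depth = 0;
    // array_slice(..., 1) drops the T_OPEN_TAG token produced by the "<?php " prefix.
    foreach (array_slice(token_get_all("<?php " . $list), 1) as $tok) {
        $text = is_array($tok) ? $tok[1] : $tok;
        if ($text === "(" || $text === "[" || $text === "{") {
            $depth++;
        } elseif ($text === ")" || $text === "]" || $text === "}") {
            $depth--;
        }
        if ($text === "," && $depth === 0) {
            // A comma at depth 0 separates two entries.
            $parts[] = trim($current);
            $current = "";
        } else {
            $current .= $text;
        }
    }
    if (trim($current) !== "") {
        $parts[] = trim($current);
    }
    return $parts;
}

print_r(splitTopLevel('1, array(2, 3), function($a, $b) { return [$a, $b]; }'));
```

Because string literals arrive as a single token including their quotes, a comma or bracket inside a quoted string never reaches the comparisons, which is one thing a plain regex has to work much harder for.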
To get the array's data, you can run print_r(eval("return $newcode;"));
which prints the entries of the array:
Array
(
    [0] => "a"
    [a] => $GlobalScopeVar
    [b] => array("nested"=>array(1,2,3))
    [c] => function() use (&$VAR) { return isset($VAR) ? "defined" : "undefined" ; }
    [1] => "string_literal"
    [2] => 12345
)
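Since every entry is now a string of PHP source, the entries can be handled one at a time. The sketch below assumes a hand-shortened $newcode (not the script's real output) and shows one possible way to walk the entries and evaluate a single one that does not depend on outside variables. As with any use of eval, this only makes sense for code you trust, such as the internal files mentioned in the question.

```php
<?php
// Hand-written stand-in for the script's $newcode, shortened for illustration.
$newcode = <<<'PHP'
array(0 => '"a"', "b" => 'array("nested"=>array(1,2,3))', '12345')
PHP;

// Turn the rewritten source into a real array of key => source-string pairs.
$entries = eval("return $newcode;");

foreach ($entries as $key => $source) {
    // Each value is still source code, so it can be inspected before running it.
    printf("%s => %s\n", var_export($key, true), $source);
}

// An entry without outer-scope dependencies can be evaluated on its own.
$nested = eval("return " . $entries["b"] . ";");
print_r($nested);
```

Entries such as $GlobalScopeVar or the closure with use (&$VAR) would need those variables to exist in the scope where eval runs, so they are better left as strings until that scope is available.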