Parsing a string with recursive parentheses
Problem description
I'm trying to parse a string with the following structure in PHP:
a,b,c(d,e,f(g),h,i(j,k)),l,m,n(o),p
For example, a "real" string would be:
id,topic,member(name,email,group(id,name)),message(id,title,body)
My end result should be an array:
[
    id => null,
    topic => null,
    member => [
        name => null,
        email => null,
        group => [
            id => null,
            name => null
        ]
    ],
    message => [
        id => null,
        title => null,
        body => null
    ]
]
I've tried a recursive regex, but got totally lost. I've had some success iterating over the string's characters, but that seems a bit "over-complicated", and I'm sure this is something a regex can handle; I just don't know how.
The purpose is to parse a fields query parameter for a REST API, so that a client can select the fields it wants from a complex object collection. I don't want to limit the "depth" of the field selection.
Recommended answer
As Wiktor pointed out, this can be achieved with the help of a lexer. The following answer uses a Lexer class originally written by Nikita Popov.
It scans the string and searches for matches as defined in the $tokenMap. The token types are T_FIELD, T_SEPARATOR, T_OPEN and T_CLOSE. The matches found are put in an array called $structure.
Afterwards we need to loop over this array and build the nested structure from it. As there can be multiple levels of nesting, I chose a recursive approach (generate()).
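The lexer class itself is not reproduced in this answer. As a rough stand-in, the sketch below assumes that the class takes a pattern-to-token map and returns a flat list of [text, type] pairs, which is the shape generate() reads. The class body and the numeric constant values are assumptions made for illustration, not the referenced implementation.

```php
<?php
// Token type constants; the names come from the answer, the values are assumed.
define('T_FIELD', 1);
define('T_SEPARATOR', 2);
define('T_OPEN', 3);
define('T_CLOSE', 4);

// Hypothetical stand-in for the lexer class mentioned in the answer.
class Lexer
{
    private $regex;
    private $types;

    public function __construct(array $tokenMap)
    {
        // One capture group per pattern; the "A" modifier anchors every
        // match at the current offset so no input can be skipped silently.
        $this->regex = '~(' . implode(')|(', array_keys($tokenMap)) . ')~A';
        $this->types = array_values($tokenMap);
    }

    public function lex($string)
    {
        $tokens = array();
        $offset = 0;
        $length = strlen($string);
        while ($offset < $length) {
            if (!preg_match($this->regex, $string, $m, 0, $offset)) {
                throw new InvalidArgumentException("Unexpected input at offset $offset");
            }
            // The first non-empty capture group identifies the pattern that matched.
            for ($i = 1; '' === $m[$i]; ++$i);
            $tokens[] = array($m[0], $this->types[$i - 1]);
            $offset += strlen($m[0]);
        }
        return $tokens;
    }
}

$lexer = new Lexer(array(
    '[^,()]+' => T_FIELD,
    ','       => T_SEPARATOR,
    '\('      => T_OPEN,
    '\)'      => T_CLOSE,
));
$structure = $lexer->lex('a,b(c)');
print_r($structure); // six [text, type] pairs, one per token
```

Keeping the token list flat leaves all nesting logic to the second step, so the lexer only has to know what each piece of text is, not where it belongs.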
A demo can be found on ideone.com.
The actual code with explanations:
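// token types used by the lexer (any distinct values will do)
define('T_FIELD', 1);
define('T_SEPARATOR', 2);
define('T_OPEN', 3);
define('T_CLOSE', 4);

// this is our $tokenMap
$tokenMap = array(
    '[^,()]+' => T_FIELD,     # not comma or parentheses
    ','       => T_SEPARATOR, # a comma
    '\('      => T_OPEN,      # an opening parenthesis
    '\)'      => T_CLOSE      # a closing parenthesis
);

// this is your string
$string = "id,topic,member(name,email,group(id,name)),message(id,title,body)";

// a recursive function to actually build the structure;
// $i is passed by reference so the caller resumes after the nested group
// instead of walking over the nested tokens a second time
function generate(array $arr, &$i = 0) {
    $output = array();
    $current = null;
    for (; $i < count($arr); $i++) {
        list($element, $type) = $arr[$i];
        if ($type == T_OPEN) {
            $i++; // step past the opening parenthesis
            $output[$current] = generate($arr, $i);
            // $i now sits on the matching T_CLOSE; the loop's $i++ skips it
        } elseif ($type == T_CLOSE) {
            return $output;
        } elseif ($type == T_FIELD) {
            $output[$element] = null;
            $current = $element;
        }
    }
    return $output;
}

// Lexer is the class mentioned above; it must be loaded before this point
$lex = new Lexer($tokenMap);
$structure = $lex->lex($string);
print_r(generate($structure));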
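The question mentions iterating over the string's characters. For comparison, here is a minimal sketch of that single-pass approach without a separate lexer. The function parseFields() and its parameters are hypothetical names, not part of the answer above, and the error handling is only one possible choice.

```php
<?php
// Hypothetical helper: parses "a,b(c,d(e))" into a nested array in one pass.
// $pos is shared by reference across recursive calls so each level resumes
// where the nested level stopped; $depth tracks open parentheses.
function parseFields($input, &$pos = 0, $depth = 0)
{
    $fields = array();
    $name = '';
    $length = strlen($input);
    while ($pos < $length) {
        $char = $input[$pos++];
        if ($char === '(') {
            if ($name === '') {
                throw new InvalidArgumentException('"(" without a field name');
            }
            // The nested call consumes everything up to the matching ")".
            $fields[$name] = parseFields($input, $pos, $depth + 1);
            $name = '';
        } elseif ($char === ')') {
            if ($depth === 0) {
                throw new InvalidArgumentException('Unbalanced ")"');
            }
            if ($name !== '') {
                $fields[$name] = null;
            }
            return $fields;
        } elseif ($char === ',') {
            if ($name !== '') {
                $fields[$name] = null;
            }
            $name = '';
        } else {
            $name .= $char; // part of a field name
        }
    }
    if ($depth > 0) {
        throw new InvalidArgumentException('Missing ")"');
    }
    if ($name !== '') {
        $fields[$name] = null;
    }
    return $fields;
}

$tree = parseFields('id,topic,member(name,email,group(id,name)),message(id,title,body)');
print_r($tree);
```

Unlike the token-based version, this variant rejects unbalanced parentheses, which may be useful when the string comes straight from a query parameter.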