Parsing a string with recursive parentheses

Question

I'm trying to parse a string with the following structure in PHP:

a,b,c(d,e,f(g),h,i(j,k)),l,m,n(o),p

For example, a "real" string will be:

id,topic,member(name,email,group(id,name)),message(id,title,body)

My end result should be an array:

[
   id => null,
   topic => null,
   member => [
      name => null,
      email => null,
      group => [
         id => null,
         name => null
      ]
   ],
   message => [
      id => null,
      title => null,
      body => null
   ]
]

I've tried a recursive regex but got totally lost. I've had some success iterating over the string's characters, but that seems a bit "over complicated", and I'm sure this is something a regex can handle; I just don't know how.

The purpose is to parse a fields query parameter for a REST API, to allow the client to select the fields he wants from a complex object collection, and I don't want to limit the "depth" of the field selection.

Recommended answer

As Wiktor pointed out, this can be achieved with the help of a lexer. The following answer uses a class originally written by Nikita Popov, which can be found here.

It skims through the string and searches for matches as defined in the $tokenMap. These are defined as T_FIELD, T_SEPARATOR, T_OPEN and T_CLOSE. The values found are put in an array called $structure.
Afterwards we need to loop over this array and build the structure out of it. As there can be multiple nestings, I chose a recursive approach (generate()).
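
The Lexer class itself is not reproduced in this answer. As a rough idea of what such a class might look like, here is a minimal sketch. It is an assumption written for illustration, not the original implementation: it joins the patterns from the token map into a single anchored regex and returns each token as an array of matched text and token type, which is the shape generate() expects. The string token types in the demo at the bottom are placeholders.

```php
<?php
// Minimal stand-in for the Lexer class (an assumption, not the original code).
class Lexer
{
    private $regex;
    private $types;

    public function __construct(array $tokenMap)
    {
        // One capture group per pattern; the "A" modifier anchors each match
        // at the current offset so no input is silently skipped.
        $this->regex = '~(' . implode(')|(', array_keys($tokenMap)) . ')~A';
        $this->types = array_values($tokenMap);
    }

    public function lex($string)
    {
        $tokens = array();
        $offset = 0;
        $length = strlen($string);
        while ($offset < $length) {
            if (!preg_match($this->regex, $string, $matches, 0, $offset)) {
                throw new Exception("Unexpected character at offset $offset");
            }
            // PHP drops trailing unmatched groups, so the last index in
            // $matches identifies the pattern that matched.
            $type = $this->types[count($matches) - 2];
            $tokens[] = array($matches[0], $type);
            $offset += strlen($matches[0]);
        }
        return $tokens;
    }
}

// Demo with placeholder string token types (hypothetical names).
$demoLexer = new Lexer(array(
    '[^,()]+' => 'field',
    ','       => 'separator',
    '\('      => 'open',
    '\)'      => 'close',
));
$demoTokens = $demoLexer->lex('a(b)');
```

This sketch assumes no pattern can match an empty string and that no pattern contains the `~` delimiter; both hold for the four patterns used in this answer.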

A demo can be found on ideone.com.

The actual code with explanations:

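// token types used in $tokenMap (the values are arbitrary; they only need to be distinct)
define('T_FIELD', 1);
define('T_SEPARATOR', 2);
define('T_OPEN', 3);
define('T_CLOSE', 4);

// this is our $tokenMap
$tokenMap = array(
    '[^,()]+'       => T_FIELD,     # not comma or parentheses
    ','             => T_SEPARATOR, # a comma
    '\('            => T_OPEN,      # an opening parenthesis
    '\)'            => T_CLOSE      # a closing parenthesis
);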

// this is your string
$string = "id,topic,member(name,email,group(id,name)),message(id,title,body)";

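// a recursive function to actually build the structure
// $idx is passed by reference so the caller resumes after the tokens
// that a nested call has already consumed
function generate(array $arr, &$idx = 0) {
    $output = array();
    $current = null;
    for (; $idx < count($arr); $idx++) {
        list($element, $type) = $arr[$idx];
        if ($type == T_OPEN) {
            $idx++; // step past the opening parenthesis
            $output[$current] = generate($arr, $idx);
            // $idx now sits on the matching T_CLOSE; the loop's $idx++ skips it
        } elseif ($type == T_CLOSE) {
            return $output;
        } elseif ($type == T_FIELD) {
            $output[$element] = null;
            $current = $element;
        }
    }
    return $output;
}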

$lex = new Lexer($tokenMap);
$structure = $lex->lex($string);

print_r(generate($structure));
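
For comparison, the same structure can be built without a separate lexer class. The sketch below is a swapped-in variant, not the approach from the answer above: it tokenizes with a single preg_match_all() call and replaces the recursion with an explicit stack of references. The function name parseFields() is hypothetical.

```php
<?php
// Hypothetical helper (not part of the original answer): parses a field list
// such as "a,b(c,d(e)),f" into a nested array in one pass.
function parseFields($input)
{
    // Split the input into field names and the three punctuation characters.
    preg_match_all('/[^,()]+|[,()]/', $input, $matches);

    $root  = array();
    $stack = array(&$root); // references to the arrays currently being filled
    $last  = null;          // most recent field name at the current depth

    foreach ($matches[0] as $token) {
        $top = count($stack) - 1;
        if ($token === '(') {
            if ($last === null) {
                throw new InvalidArgumentException('"(" must follow a field name');
            }
            // Turn the last field into a sub-array and descend into it.
            $stack[$top][$last] = array();
            $stack[] = &$stack[$top][$last];
            $last = null;
        } elseif ($token === ')') {
            if ($top === 0) {
                throw new InvalidArgumentException('Unbalanced ")"');
            }
            array_pop($stack); // return to the parent level
            $last = null;
        } elseif ($token !== ',') {
            $stack[$top][$token] = null;
            $last = $token;
        }
    }
    if (count($stack) !== 1) {
        throw new InvalidArgumentException('Unbalanced "("');
    }
    return $root;
}

$fields = parseFields('id,topic,member(name,email,group(id,name)),message(id,title,body)');
```

Unlike generate(), this version rejects unbalanced parentheses instead of returning a partial result. Whether that is desirable for a REST query parameter depends on how strict the API should be.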
