将结构转换为PHP数组 [英] Convert structure to PHP array

查看:239
本文介绍了将结构转换为PHP数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们公司使用的可怕系统为我提供了以下输出:

{
    party:"bases",
    number:"1",
    id:"xx_3039366",
    url:"systen01-ny.com",
    target:"_self",
    address:"Ch\u00e3o as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA",
    coordinate:{
        x:90.995262145996094,
        y:-1.3394836426
    },
    contactDetails:{
        id:"366",
        phone:"xxxxxx",
        mobile:"",
        fax:"xxxx 777 235",
        c2c:!0
    },
    parameters:"Flex Am\u00e1vel Silva,hal,,EN_30336,S,786657,1,0,",
    text:"Vila Nova de Loz C\u00f4a,os melhores vinhos, v\u00e1rias. Produtor/exportador/com\u00e9rcio",
    website:null,
    mail:"",
    listing:"paid",
    pCode:"64",
    name:"xpto Am\u00e1vel Costa",
    logo:{src:"http://ny.test.gif",
    altname:"xpto Am\u00e1vel Costa"},
    bookingUrl:"",
    ipUrl:"",
    ipLabel:"",
    customerId:"7657",
    addressId:"98760",
    combined:null,
    showReviews:!0
}

我想知道是否有一种方法可以将输出转换为数组,就好像它是一个json一样,或者甚至是某种其他格式,我都可以在PHP中操作此数据. Json_decode不起作用.

解决方案

就像我说的那样,这里是您自己的Json Object解析器.

一个警告,这些事情可能比艺术更像是科学,因此,如果您的输入与示例中的内容有所不同,则可能会出现问题.鉴于样本量很小(1个文档),因此我无法保证一个示例之外的功能.

我会尝试解释它是如何工作的,但我担心它会在凡人身上丢失.

说真的,这很有趣,一次享受了挑战.

<?php
function parseJson($subject, $tokens)
{
    $types = array_keys($tokens);
    $patterns = [];
    $lexer_stream = [];

    $result = false;

    foreach ($tokens as $k=>$v){
        $patterns[] = "(?P<$k>$v)";      
    } 
    $pattern = "/".implode('|', $patterns)."/i";

    if (preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE)) {
        //print_r($matches);
        foreach ($matches[0] as $key => $value) {
            $match = [];
            foreach ($types as $type) {
                $match = $matches[$type][$key];
                if (is_array($match) && $match[1] != -1) {
                    break;
                }
            }

            $tok  = [
                'content' => $match[0],
                'type' => $type,
                'offset' => $match[1]
            ];

            $lexer_stream[] = $tok;       
        }

       $result = parseJsonTokens( $lexer_stream );
    }
    return $result;
} 

function parseJsonTokens( array &$lexer_stream ){

    $result = [];

    next($lexer_stream); //advnace one
    $mode = 'key'; //items start in key mode  ( key => value )

    $key = '';
    $value = '';

    while($current = current($lexer_stream)){
        $content = $current['content'];
        $type = $current['type'];

        switch($type){
            case 'T_WHITESPACE'://ignore whitespace
                next($lexer_stream);
            break;
            case 'T_STRING':
                //keys are always strings, but strings are not always keys
                if( $mode == 'key')
                    $key .= $content;
                else
                    $value .= $content;           
                next($lexer_stream); //consume a token
            break;
            case 'T_COLON':
                $mode = 'value'; //change mode key :
                next($lexer_stream);//consume a token
            break;
            case 'T_ENCAP_STRING':
                $value .= trim(unicode_decode($content),'"'); //encapsulated strings are always content
                next($lexer_stream);//consume a token
            break;   
            case 'T_NULL':
                 $value = null; //encapsulated strings are always content
                 next($lexer_stream);//consume a token
            break;          
            case 'T_COMMA':  //comma ends an item               
                //store
                $result[$key] = $value;
                //reset
                $mode = 'key'; //items start in key mode  ( key => value ) 
                $key = '';
                $value = ''; 
                next($lexer_stream);//consume a token
            break;
            case 'T_OPEN_BRACE': //start of a sub-block
                $value = parseJsonTokens($lexer_stream); //recursive
            break;
            case 'T_CLOSE_BRACE': //start of a sub-block
                //store
                $result[$key] = $value;
                next($lexer_stream);//consume a token
                return $result;
            break;
            default:
                print_r($current);
                trigger_error("Unknown token $type value $content", E_USER_ERROR);
        }

    }

    if( !$current ) return;   
    print_r($current);
    trigger_error("Unclosed item $mode for $type value $content", E_USER_ERROR);
}

//@see https://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-cha
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}

function unicode_decode($str) {
    return preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);
}

$str = '{
    party:"bases",
    number:"1",
    id:"xx_3039366",
    url:"systen01-ny.com",
    target:"_self",
    address:"Ch\u00e3o as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA",
    coordinate:{
        x:90.995262145996094,
        y:-1.3394836426
    },
    contactDetails:{
        id:"366",
        phone:"xxxxxx",
        mobile:"",
        fax:"xxxx 777 235",
        c2c:!0
    },
    parameters:"Flex Am\u00e1vel Silva,hal,,EN_30336,S,786657,1,0,",
    text:"Vila Nova de Loz C\u00f4a,os melhores vinhos, v\u00e1rias. Produtor/exportador/com\u00e9rcio",
    website:null,
    mail:"",
    listing:"paid",
    pCode:"64",
    name:"xpto Am\u00e1vel Costa",
    logo:{src:"http://ny.test.gif",
    altname:"xpto Am\u00e1vel Costa"},
    bookingUrl:"",
    ipUrl:"",
    ipLabel:"",
    customerId:"7657",
    addressId:"98760",
    combined:null,
    showReviews:!0
}';

$tokens = [
    'T_OPEN_BRACE'      => '\{',
    'T_CLOSE_BRACE'     => '\}',
    'T_NULL'            => '\bnull\b',
    'T_ENCAP_STRING'    => '\".*?(?<!\\\\)\"',
    'T_COLON'           => ':',
    'T_COMMA'           => ',',
    'T_STRING'          => '[-a-z0-9_.!]+',
    'T_WHITESPACE'      => '[\r\n\s\t]+',
    'T_UNKNOWN'         => '.+?'
];

var_export( parseJson($str, $tokens) );

输出(这是每个人都想要的)

array (
  'party' => 'bases',
  'number' => '1',
  'id' => 'xx_3039366',
  'url' => 'systen01-ny.com',
  'target' => '_self',
  'address' => 'Chão as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA',
  'coordinate' => 
  array (
    'x' => '90.995262145996094',
    'y' => '-1.3394836426',
  ),
  'contactDetails' => 
  array (
    'id' => '366',
    'phone' => 'xxxxxx',
    'mobile' => '',
    'fax' => 'xxxx 777 235',
    'c2c' => '!0',
  ),
  'parameters' => 'Flex Amável Silva,hal,,EN_30336,S,786657,1,0,',
  'text' => 'Vila Nova de Loz Côa,os melhores vinhos, várias. Produtor/exportador/comércio',
  'website' => NULL,
  'mail' => '',
  'listing' => 'paid',
  'pCode' => '64',
  'name' => 'xpto Amável Costa',
  'logo' => 
  array (
    'src' => 'http://ny.test.gif',
    'altname' => 'xpto Amável Costa',
  ),
  'bookingUrl' => '',
  'ipUrl' => '',
  'ipLabel' => '',
  'customerId' => '7657',
  'addressId' => '98760',
  'combined' => NULL,
  'showReviews' => '!0',
)

您甚至可以在这里进行测试(因为我是个好人)

http://sandbox.onlinephpfunctions.com/code/3c1dcafb59abbc19f7f3209724a465

借助此SO帖子,我能够解决\u00e等编码问题,因此大声疾呼,因为我讨厌字符编码.

http ://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-char

伙计,我只是喜欢一段漂亮的代码,只是umm.

干杯!

The horrible system we use in my company gives me the following output:

{
    party:"bases",
    number:"1",
    id:"xx_3039366",
    url:"systen01-ny.com",
    target:"_self",
    address:"Ch\u00e3o as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA",
    coordinate:{
        x:90.995262145996094,
        y:-1.3394836426
    },
    contactDetails:{
        id:"366",
        phone:"xxxxxx",
        mobile:"",
        fax:"xxxx 777 235",
        c2c:!0
    },
    parameters:"Flex Am\u00e1vel Silva,hal,,EN_30336,S,786657,1,0,",
    text:"Vila Nova de Loz C\u00f4a,os melhores vinhos, v\u00e1rias. Produtor/exportador/com\u00e9rcio",
    website:null,
    mail:"",
    listing:"paid",
    pCode:"64",
    name:"xpto Am\u00e1vel Costa",
    logo:{src:"http://ny.test.gif",
    altname:"xpto Am\u00e1vel Costa"},
    bookingUrl:"",
    ipUrl:"",
    ipLabel:"",
    customerId:"7657",
    addressId:"98760",
    combined:null,
    showReviews:!0
}

I would like to know if there is a way to convert the output to array, as if it were a json, or even some other format that I can manipulate this data in PHP. Json_decode does not work.

解决方案

Just like I said I would, here your very own Json Object parser.

One word of warning, these kind of things can be more art then science so if your inputs vary from what was in your example, it could have issues. Given the small sample size (1 document) I make no guarantees on it's functionality outside that one example.

I would try to explain how this works, but I fear it would be lost on mere mortals.

Seriously, this was fun, enjoyed the challenge for once.

<?php
function parseJson($subject, $tokens)
{
    $types = array_keys($tokens);
    $patterns = [];
    $lexer_stream = [];

    $result = false;

    foreach ($tokens as $k=>$v){
        $patterns[] = "(?P<$k>$v)";      
    } 
    $pattern = "/".implode('|', $patterns)."/i";

    if (preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE)) {
        //print_r($matches);
        foreach ($matches[0] as $key => $value) {
            $match = [];
            foreach ($types as $type) {
                $match = $matches[$type][$key];
                if (is_array($match) && $match[1] != -1) {
                    break;
                }
            }

            $tok  = [
                'content' => $match[0],
                'type' => $type,
                'offset' => $match[1]
            ];

            $lexer_stream[] = $tok;       
        }

       $result = parseJsonTokens( $lexer_stream );
    }
    return $result;
} 

function parseJsonTokens( array &$lexer_stream ){

    $result = [];

    next($lexer_stream); //advnace one
    $mode = 'key'; //items start in key mode  ( key => value )

    $key = '';
    $value = '';

    while($current = current($lexer_stream)){
        $content = $current['content'];
        $type = $current['type'];

        switch($type){
            case 'T_WHITESPACE'://ignore whitespace
                next($lexer_stream);
            break;
            case 'T_STRING':
                //keys are always strings, but strings are not always keys
                if( $mode == 'key')
                    $key .= $content;
                else
                    $value .= $content;           
                next($lexer_stream); //consume a token
            break;
            case 'T_COLON':
                $mode = 'value'; //change mode key :
                next($lexer_stream);//consume a token
            break;
            case 'T_ENCAP_STRING':
                $value .= trim(unicode_decode($content),'"'); //encapsulated strings are always content
                next($lexer_stream);//consume a token
            break;   
            case 'T_NULL':
                 $value = null; //encapsulated strings are always content
                 next($lexer_stream);//consume a token
            break;          
            case 'T_COMMA':  //comma ends an item               
                //store
                $result[$key] = $value;
                //reset
                $mode = 'key'; //items start in key mode  ( key => value ) 
                $key = '';
                $value = ''; 
                next($lexer_stream);//consume a token
            break;
            case 'T_OPEN_BRACE': //start of a sub-block
                $value = parseJsonTokens($lexer_stream); //recursive
            break;
            case 'T_CLOSE_BRACE': //start of a sub-block
                //store
                $result[$key] = $value;
                next($lexer_stream);//consume a token
                return $result;
            break;
            default:
                print_r($current);
                trigger_error("Unknown token $type value $content", E_USER_ERROR);
        }

    }

    if( !$current ) return;   
    print_r($current);
    trigger_error("Unclosed item $mode for $type value $content", E_USER_ERROR);
}

//@see https://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-cha
function replace_unicode_escape_sequence($match) {
    return mb_convert_encoding(pack('H*', $match[1]), 'UTF-8', 'UCS-2BE');
}

function unicode_decode($str) {
    return preg_replace_callback('/\\\\u([0-9a-f]{4})/i', 'replace_unicode_escape_sequence', $str);
}

$str = '{
    party:"bases",
    number:"1",
    id:"xx_3039366",
    url:"systen01-ny.com",
    target:"_self",
    address:"Ch\u00e3o as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA",
    coordinate:{
        x:90.995262145996094,
        y:-1.3394836426
    },
    contactDetails:{
        id:"366",
        phone:"xxxxxx",
        mobile:"",
        fax:"xxxx 777 235",
        c2c:!0
    },
    parameters:"Flex Am\u00e1vel Silva,hal,,EN_30336,S,786657,1,0,",
    text:"Vila Nova de Loz C\u00f4a,os melhores vinhos, v\u00e1rias. Produtor/exportador/com\u00e9rcio",
    website:null,
    mail:"",
    listing:"paid",
    pCode:"64",
    name:"xpto Am\u00e1vel Costa",
    logo:{src:"http://ny.test.gif",
    altname:"xpto Am\u00e1vel Costa"},
    bookingUrl:"",
    ipUrl:"",
    ipLabel:"",
    customerId:"7657",
    addressId:"98760",
    combined:null,
    showReviews:!0
}';

$tokens = [
    'T_OPEN_BRACE'      => '\{',
    'T_CLOSE_BRACE'     => '\}',
    'T_NULL'            => '\bnull\b',
    'T_ENCAP_STRING'    => '\".*?(?<!\\\\)\"',
    'T_COLON'           => ':',
    'T_COMMA'           => ',',
    'T_STRING'          => '[-a-z0-9_.!]+',
    'T_WHITESPACE'      => '[\r\n\s\t]+',
    'T_UNKNOWN'         => '.+?'
];

var_export( parseJson($str, $tokens) );

Outputs ( this is what everyone wants )

array (
  'party' => 'bases',
  'number' => '1',
  'id' => 'xx_3039366',
  'url' => 'systen01-ny.com',
  'target' => '_self',
  'address' => 'Chão as Alminhas-Medas,Uteiros de Gatos e Fontes Longaq<br/>64320-761 ANHADOS LdA',
  'coordinate' => 
  array (
    'x' => '90.995262145996094',
    'y' => '-1.3394836426',
  ),
  'contactDetails' => 
  array (
    'id' => '366',
    'phone' => 'xxxxxx',
    'mobile' => '',
    'fax' => 'xxxx 777 235',
    'c2c' => '!0',
  ),
  'parameters' => 'Flex Amável Silva,hal,,EN_30336,S,786657,1,0,',
  'text' => 'Vila Nova de Loz Côa,os melhores vinhos, várias. Produtor/exportador/comércio',
  'website' => NULL,
  'mail' => '',
  'listing' => 'paid',
  'pCode' => '64',
  'name' => 'xpto Amável Costa',
  'logo' => 
  array (
    'src' => 'http://ny.test.gif',
    'altname' => 'xpto Amável Costa',
  ),
  'bookingUrl' => '',
  'ipUrl' => '',
  'ipLabel' => '',
  'customerId' => '7657',
  'addressId' => '98760',
  'combined' => NULL,
  'showReviews' => '!0',
)

And you can even test it here ( because I am a nice guy )

http://sandbox.onlinephpfunctions.com/code/3c1dcafb59abbf19f7f3209724dbdd4a46546c57

I was able to fix the encoding issues \u00e etc with help of this SO post, so a shout out to them, because I hate character encoding.

http://stackoverflow.com/questions/2934563/how-to-decode-unicode-escape-sequences-like-u00ed-to-proper-utf-8-encoded-char

Man I just love a beautiful piece of code, just umm.

Cheers!

这篇关于将结构转换为PHP数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆