How to parse an XML file with CDATA to an associative array in PHP

Problem description

All,

I have an inputXml.xml file as below:

<content>
<item  name="book" label="Book">
<![CDATA[ book name ]]>
</item>
<item  name="price" label="Price">
<![CDATA[ 35 ]]>
</item>
</content>

And when I use code as below to parse the xml file:

$obj = simplexml_load_string(file_get_contents($inputXml),'SimpleXMLElement', LIBXML_NOCDATA);
$json = json_encode($obj);
$inputArray = json_decode($json,TRUE);

I get the array like below:

[content] => Array
            (
                [item] => Array
                    (
                        [0] => book name
                        [1] => 35
                    )

            )

I am wondering, is it possible to get an associative array by using the value of the attributes "name" or "label" as the key as below:

[content] => Array
            (
                [item] => Array
                    (
                        [name] => book name
                        [price] => 35
                    )

            )

Solution

First of all, you've been fooled by some other code into thinking that you need json_encode and json_decode to get the array out of a SimpleXMLElement. Instead, you only need to cast to array:

$inputArray = (array) $obj;

Then you've got the problem that the array serialization you're looking for is not the default serialization that SimpleXMLElement provides for that XML.

A second, minor problem is the dependency on LIBXML_NOCDATA: without that flag the json_encode round trip would not even come close to the format you want. Not depending on the flag (and therefore on whether the underlying XML uses CDATA to encode element values) makes the code more stable.
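To illustrate the point about LIBXML_NOCDATA, here is a minimal sketch, assuming a sample document modelled on the question's XML. It parses the same string with and without the flag and casts an item to string in both cases; the variable names are made up for the illustration.

```php
<?php
// Sample document modelled on the question's inputXml.xml (an assumption).
$xml = <<<'XML'
<content>
<item name="book" label="Book"><![CDATA[ book name ]]></item>
<item name="price" label="Price"><![CDATA[ 35 ]]></item>
</content>
XML;

// Parse once without and once with LIBXML_NOCDATA.
$plain   = simplexml_load_string($xml);
$noCdata = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);

// Casting an element to string yields its text content in both cases,
// so code that reads values via (string) does not need the flag.
$fromPlain   = trim((string) $plain->item[0]);
$fromNoCdata = trim((string) $noCdata->item[0]);

var_dump($fromPlain === $fromNoCdata);
```

The trim() call is only there because the CDATA sections in the sample carry padding spaces.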

As SimpleXMLElement does not provide the behavior you want, you normally have two options: extend SimpleXMLElement or decorate it. I normally suggest decoration, as extension is limited. For example, you cannot interfere with the (array) cast via extension, although you can for JSON serialization. But that is not what you are looking for; you are looking for array serialization.
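For comparison, this is roughly what the extension route looks like for JSON serialization. It is only a sketch: the class name ItemMapXMLElement is hypothetical, and the XML string is a shortened version of the question's document. The subclass implements JsonSerializable, which json_encode honors; nothing comparable exists for the (array) cast.

```php
<?php
// Hypothetical subclass: changes how json_encode() sees a <content> element.
class ItemMapXMLElement extends SimpleXMLElement implements JsonSerializable
{
    #[\ReturnTypeWillChange]
    public function jsonSerialize()
    {
        $map = [];
        // Key each <item> by its "name" attribute; the cast reads CDATA text.
        foreach ($this->item as $item) {
            $map[(string) $item['name']] = trim((string) $item);
        }

        return ['item' => $map];
    }
}

$xml = '<content>'
     . '<item name="book" label="Book"><![CDATA[ book name ]]></item>'
     . '<item name="price" label="Price"><![CDATA[ 35 ]]></item>'
     . '</content>';

// The second argument tells SimpleXML which class to instantiate.
$obj  = simplexml_load_string($xml, 'ItemMapXMLElement');
$json = json_encode($obj);

echo $json, "\n";
```

This still goes through JSON, so it only helps when JSON (or a JSON round trip) is acceptable.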

So for a kind-of-standard array serialization of a SimpleXMLElement you could implement this with a serializer and a strategy object on how to array-serialize a specific element.

This first needs the serializer:

interface ArraySerializer
{
    public function arraySerialize();
}

class SimpleXMLArraySerializer implements ArraySerializer
{
    /**
     * @var SimpleXMLElement
     */
    private $subject;

    /**
     * @var SimpleXMLArraySerializeStrategy
     */
    private $strategy;

    public function __construct(SimpleXMLElement $element, SimpleXMLArraySerializeStrategy $strategy = NULL) {
        $this->subject  = $element;
        $this->strategy = $strategy ?: new DefaultSimpleXMLArraySerializeStrategy();
    }

    public function arraySerialize() {
        $strategy = $this->getStrategy();
        return $strategy->serialize($this->subject);
    }

    /**
     * @return SimpleXMLArraySerializeStrategy
     */
    public function getStrategy() {
        return $this->strategy;
    }
}

This array serializer does not yet contain the serialization logic itself. That has been delegated to a strategy so that it can easily be exchanged later on. Here is a default strategy:

abstract class SimpleXMLArraySerializeStrategy
{
    abstract public function serialize(SimpleXMLElement $element);
}

class DefaultSimpleXMLArraySerializeStrategy extends SimpleXMLArraySerializeStrategy
{
    public function serialize(SimpleXMLElement $element) {
        $array = array();

        // create array of child elements if any. group on duplicate names as an array.
        foreach ($element as $name => $child) {
            if (isset($array[$name])) {
                if (!is_array($array[$name])) {
                    $array[$name] = [$array[$name]];
                }
                $array[$name][] = $this->serialize($child);
            } else {
                $array[$name] = $this->serialize($child);
            }
        }

        // handle SimpleXMLElement text values.
        if (!$array) {
            $array = (string)$element;
        }

        // return empty elements as NULL (self-closing or empty tags);
        // compare strictly so that a text value of "0" is not turned into NULL
        if ($array === '') {
            $array = NULL;
        }

        return $array;
    }
}
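The grouping behavior of the default strategy is easier to see on a tiny document. The following sketch restates the same logic as a plain recursive function; the function name simpleXmlToArray and the sample XML are made up for the illustration and are not part of the classes above.

```php
<?php
// Function-style restatement of the default strategy (hypothetical name).
function simpleXmlToArray(SimpleXMLElement $element)
{
    $array = [];

    // Collect child elements; repeated names are grouped into a list.
    foreach ($element as $name => $child) {
        $value = simpleXmlToArray($child);
        if (isset($array[$name])) {
            if (!is_array($array[$name])) {
                $array[$name] = [$array[$name]];
            }
            $array[$name][] = $value;
        } else {
            $array[$name] = $value;
        }
    }

    // No children: fall back to the text value, or NULL for empty elements.
    if (!$array) {
        $text = (string) $element;
        return $text === '' ? null : $text;
    }

    return $array;
}

$doc    = new SimpleXMLElement('<root><a>1</a><a>2</a><b/><c>x</c></root>');
$result = simpleXmlToArray($doc);

print_r($result);
```

The two a elements end up as a list under the key a, the empty b element becomes NULL, and c keeps its text.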

This object contains a common way to convert a SimpleXMLElement into an array. It behaves comparably to what your XML, loaded as a SimpleXMLElement with LIBXML_NOCDATA, already does, but it does not depend on that flag to handle CDATA. To show this, the following example already gives the output you have:

$obj        = new SimpleXMLElement($xml);
$serializer = new SimpleXMLArraySerializer($obj);
print_r($serializer->arraySerialize());

Because the array serialization is now implemented in types of its own, it is easy to change it as needed. For the content element you need a different strategy to turn it into an array. It is also far simpler:

class ContentXMLArraySerializeStrategy extends SimpleXMLArraySerializeStrategy
{
    public function serialize(SimpleXMLElement $element) {
        $array = array();

        foreach ($element->item as $item) {
            $array[(string) $item['name']] = (string) $item;
        }

        return array('item' => $array);
    }
}
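The question also mentions the label attribute as a possible key. A small variation of the same loop covers that; in this sketch the helper name itemsByAttribute is hypothetical and the attribute to key on is passed as a parameter.

```php
<?php
/**
 * Hypothetical helper: map each <item> to its text, keyed by one attribute.
 */
function itemsByAttribute(SimpleXMLElement $content, $attribute)
{
    $array = [];

    foreach ($content->item as $item) {
        $key = (string) $item[$attribute];
        if ($key === '') {
            continue; // skip items that lack the requested attribute
        }
        // trim() drops the padding inside the CDATA sections of the sample.
        $array[$key] = trim((string) $item);
    }

    return $array;
}

$content = new SimpleXMLElement(
    '<content>'
    . '<item name="book" label="Book"><![CDATA[ book name ]]></item>'
    . '<item name="price" label="Price"><![CDATA[ 35 ]]></item>'
    . '</content>'
);

$byLabel = itemsByAttribute($content, 'label');
$byName  = itemsByAttribute($content, 'name');

print_r($byLabel);
```

The same body could be dropped into a strategy class if the serializer structure is kept.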

What's left is to wire this into the SimpleXMLArraySerializer on the right condition. E.g. depending on the name of the element:

...

    /**
     * @return SimpleXMLArraySerializeStrategy
     */
    public function getStrategy() {
        if ($this->subject->getName() === 'content') {
            return new ContentXMLArraySerializeStrategy();
        }

        return $this->strategy;
    }
}

Now the same example from above:

$obj        = new SimpleXMLElement($xml);
$serializer = new SimpleXMLArraySerializer($obj);
print_r($serializer->arraySerialize());

would give you the wanted output (beautified):

Array
(
    [item] => Array
        (
            [book]  => book name
            [price] => 35 
        )
)

As your XML probably only has this one element, I'd say such a level of abstraction might be a little much. However, if the XML is going to change and you actually have multiple array format needs within the same document, this is a plausible way to go.

The default serialization I've used in my example is based on the decoration example in SimpleXML and JSON Encode in PHP – Part III and End.
