How does Excel read an XML file?

Question

I have done a lot of research on converting an XML file to a 2D array the same way Excel does, trying to reproduce the algorithm Excel applies when you open an XML file in it.

<items>
    <item>
        <sku>abc 1</sku>
        <title>a book 1</title>
        <price>42 1</price>
        <attributes>
            <attribute>
                <name>Number of pages 1</name>
                <value>123 1</value>
            </attribute>
            <attribute>
                <name>Author 1</name>
                <value>Rob dude 1</value>
            </attribute>
        </attributes>
        <contributors>
            <contributor>John 1</contributor>
            <contributor>Ryan 1</contributor>
        </contributors>
        <isbn>12345</isbn>
    </item>
    <item>
        <sku>abc 2</sku>
        <title>a book 2</title>
        <price>42 2</price>
        <attributes>
            <attribute>
                <name>Number of pages 2</name>
                <value>123 2</value>
            </attribute>
            <attribute>
                <name>Author 2</name>
                <value>Rob dude 2</value>
            </attribute>
        </attributes>
        <contributors>
            <contributor>John 2</contributor>
            <contributor>Ryan 2</contributor>
        </contributors>
        <isbn>6789</isbn>
     </item>
</items>

I want to convert it to a two-dimensional array, like the table Excel shows when you open the same file. So far I can extract the column labels the way Excel does:

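function getColNames($array) {
    $cols = array();
    foreach ($array as $key => $val) {
        // Only "complete" entries (elements without child elements) become columns,
        // and each tag name is recorded once, at its first occurrence.
        if (is_array($val) && $val['type'] == 'complete' && !in_array($val['tag'], $cols)) {
            $cols[] = $val['tag'];
        }
    }
    return $cols;
}

// $simple holds the XML document as a string.
$p = xml_parser_create();
xml_parse_into_struct($p, $simple, $vals, $index);
xml_parser_free($p);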
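The snippet above never shows the label extraction being called, so here is a minimal sketch of how the pieces might fit together. The two-element `$simple` string and the helper name `leaf_tag_names` are assumptions made for illustration. Note that `xml_parse_into_struct` upper-cases tag names unless case folding is switched off.

```php
<?php
// Hypothetical, shortened sample; the real $simple would hold the full XML string.
$simple = '<items><item><sku>abc 1</sku><title>a book 1</title></item></items>';

// Hypothetical helper: collects the distinct tag names of "complete" entries.
function leaf_tag_names(array $vals): array
{
    $cols = [];
    foreach ($vals as $val) {
        if ($val['type'] === 'complete' && !in_array($val['tag'], $cols, true)) {
            $cols[] = $val['tag'];
        }
    }
    return $cols;
}

$p = xml_parser_create();
// Keep tag names as written instead of the default upper-casing ("SKU").
xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, 0);
xml_parse_into_struct($p, $simple, $vals, $index);
// The parser is released automatically when $p goes out of scope (PHP 8).

$cols = leaf_tag_names($vals);
print_r($cols);
```

Only `sku` and `title` are reported as `complete`; `items` and `item` arrive as separate `open` and `close` entries, which is why they never show up as labels.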





Goal

I want it to generate something like this:

array (
    0 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'name'=>'Number of pages 1',
        'value'=>'123 1',
        'isbn'=>12345
    ),
    1 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'name'=>'Author 1',
        'value'=>'Rob dude 1',
        'isbn'=>12345
    ),
    2 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'contributor'=>'John 1',
        'isbn'=>12345
    ),
    3 => array (
        'sku'=>'abc 1',
        'title'=>'a book 1',
        'price'=>'42 1',
        'contributor'=>'Ryan 1',
        'isbn'=>12345
    ),
)

Sample 2 XML:

 <items>
    <item>
       <sku>abc 1</sku>
       <title>a book 1</title>
       <price>42 1</price>
       <attributes>
          <attribute>
              <name>Number of pages 1</name>
              <value>123 1</value>
          </attribute>
          <attribute>
              <name>Author 1</name>
              <value>Rob dude 1</value>
          </attribute>
       </attributes>
       <contributors>
          <contributor>John 1</contributor>
          <contributor>Ryan 1</contributor>
       </contributors>
       <isbns>
            <isbn>12345a</isbn>
            <isbn>12345b</isbn>
       </isbns>
    </item>
    <item>
       <sku>abc 2</sku>
       <title>a book 2</title>
       <price>42 2</price>
       <attributes>
          <attribute>
              <name>Number of pages 2</name>
              <value>123 2</value>
          </attribute>
          <attribute>
              <name>Author 2</name>
              <value>Rob dude 2</value>
          </attribute>
       </attributes>
       <contributors>
          <contributor>John 2</contributor>
          <contributor>Ryan 2</contributor>
       </contributors>
       <isbns>
            <isbn>6789a</isbn>
            <isbn>6789b</isbn>
       </isbns>
    </item>
    </items>

Sample 3 XML:

<items>
<item>
   <sku>abc 1</sku>
   <title>a book 1</title>
   <price>42 1</price>
   <attributes>
      <attribute>
          <name>Number of pages 1</name>
          <value>123 1</value>
      </attribute>
      <attribute>
          <name>Author 1</name>
          <value>Rob dude 1</value>
      </attribute>
   </attributes>
   <contributors>
      <contributor>John 1</contributor>
      <contributor>Ryan 1</contributor>
   </contributors>
   <isbns>
        <isbn>
            <name>isbn 1</name>
            <value>12345a</value>
        </isbn>
        <isbn>
            <name>isbn 2</name>
            <value>12345b</value>
        </isbn>
   </isbns>
</item>
<item>
   <sku>abc 2</sku>
   <title>a book 2</title>
   <price>42 2</price>
   <attributes>
      <attribute>
          <name>Number of pages 2</name>
          <value>123 2</value>
      </attribute>
      <attribute>
          <name>Author 2</name>
          <value>Rob dude 2</value>
      </attribute>
   </attributes>
   <contributors>
      <contributor>John 2</contributor>
      <contributor>Ryan 2</contributor>
   </contributors>
   <isbns>
        <isbn>
            <name>isbn 3</name>
            <value>6789a</value>
        </isbn>
        <isbn>
            <name>isbn 4</name>
            <value>6789b</value>
        </isbn>
   </isbns>
</item>
</items>


Recommended answer

Your question is vague, but in my own words, what you call "Excel" does the following: it takes each /items/item element as a row. The column names are the tag names of the leaf element nodes, in document order; if a name occurs more than once, its column takes the position of the first occurrence.
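As a small illustration of the column rule, the following sketch collects the leaf names below the first row element with SimpleXML. The XML is a shortened, hypothetical version of the first sample, and the variable names are chosen for this example only.

```php
<?php
// Shortened, hypothetical version of the question's first sample.
$source = <<<XML
<items>
    <item>
        <sku>abc 1</sku>
        <attributes>
            <attribute><name>Number of pages 1</name><value>123 1</value></attribute>
            <attribute><name>Author 1</name><value>Rob dude 1</value></attribute>
        </attributes>
        <contributors>
            <contributor>John 1</contributor>
        </contributors>
        <isbn>12345</isbn>
    </item>
</items>
XML;

$xml = simplexml_load_string($source);

$columns = [];
// Leaf elements (no element children) below the first row element, in document order.
foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
    // Using the name as the array key keeps the first position and drops repeats.
    $columns[$leaf->getName()] = null;
}

$columnNames = array_keys($columns);
print_r($columnNames);
```

The second `name` and `value` elements do not add columns, because assigning to an existing array key leaves its position unchanged.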

It then creates one row per item, but only if all of the item's child elements are leaf elements. Otherwise, the item serves as the base for several rows, and its children that contain further elements are interpolated into it. For example, if such an item has two containers that each hold two entries with the same name, each container is interpolated into two rows. Their child values are then placed in the columns with the matching names, following the logic described in the first paragraph.

How deep this logic goes is not clear from your question, so I keep it at that level only. Otherwise the interpolation would need to recurse deeper into the tree, and the algorithm as outlined might no longer fit.

To build that in PHP, XPath is particularly helpful, and the interpolation works well as a Generator:

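function tree_to_rows(SimpleXMLElement $xml)
{
    // Column names: the leaf elements below the first row element, in document
    // order; a repeated name keeps the position of its first occurrence.
    $columns = [];

    foreach ($xml->xpath('/*/*[1]//*[not(*)]') as $leaf) {
        $columns[$leaf->getName()] = null;
    }

    // The first yielded row is the header.
    yield array_keys($columns);

    // Name of the row element ("item" in the samples).
    $name = $xml->xpath('/*/*[1]')[0]->getName();

    foreach ($xml->$name as $source) {
        // Every row starts from a model with all columns set to null.
        $rowModel       = array_combine(array_keys($columns), array_fill(0, count($columns), null));
        $interpolations = [];

        foreach ($source as $child) {
            if ($child->count()) {
                // Children with child elements are interpolated later.
                $interpolations[] = $child;
            } else {
                // Direct leaf values are shared by all rows of this item.
                $rowModel[$child->getName()] = $child;
            }
        }

        // Only leaf children: the item becomes exactly one row.
        if (!$interpolations) {
            yield array_values($rowModel);
            continue;
        }

        // Otherwise yield one row per entry of each container.
        foreach ($interpolations as $interpolation) {
            foreach ($interpolation as $interpolationStep) {
                $row = $rowModel;
                // The entry itself if it is a leaf, else all leaves below it.
                foreach ($interpolationStep->xpath('(.|.//*)[not(*)]') as $leaf) {
                    $row[$leaf->getName()] = $leaf;
                }
                yield array_values($row);
            }
        }
    }
}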

Using it can then be as straightforward as:

$xml  = simplexml_load_file('items.xml');
$rows = tree_to_rows($xml);
echo new TextTable($rows);
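If associative arrays like those in the question's goal are wanted instead of a printed table, the header row can be combined with each following row. The sketch below assumes the rows arrive in the same shape `tree_to_rows()` yields them (header first, `null` for empty cells); `rows_to_assoc` and `demo_rows` are hypothetical names, and `demo_rows` only stands in for the real generator.

```php
<?php
// Hypothetical helper: turns a header row followed by data rows into
// associative arrays, leaving out cells that are null.
function rows_to_assoc(iterable $rows): array
{
    $header = null;
    $result = [];

    foreach ($rows as $row) {
        if ($header === null) {
            $header = $row; // the first row holds the column names
            continue;
        }
        $assoc = array_combine($header, $row);
        // Drop empty cells, then cast the rest (e.g. SimpleXMLElement) to string.
        $assoc = array_filter($assoc, fn($cell) => $cell !== null);
        $result[] = array_map(fn($cell) => (string) $cell, $assoc);
    }

    return $result;
}

// Stand-in generator with the same row shape as described above.
function demo_rows(): Generator
{
    yield ['sku', 'name', 'contributor'];
    yield ['abc 1', 'Author 1', null];
    yield ['abc 1', null, 'John 1'];
}

$table = rows_to_assoc(demo_rows());
print_r($table);
```

Leaving out the `null` cells mirrors the goal array in the question, where a row built from a contributor has no `name` or `value` key.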

This gives the following example output:

+-----+--------+-----+-----------------+----------+-----------+-----+
|sku  |title   |price|name             |value     |contributor|isbn |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |Number of pages 1|123 1     |           |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |Author 1         |Rob dude 1|           |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |                 |          |John 1     |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 1|a book 1|42 1 |                 |          |Ryan 1     |12345|
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |Number of pages 2|123 2     |           |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |Author 2         |Rob dude 2|           |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |                 |          |John 2     |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+
|abc 2|a book 2|42 2 |                 |          |Ryan 2     |6789 |
+-----+--------+-----+-----------------+----------+-----------+-----+

The TextTable is a slightly modified version of https://gist.github.com/hakre/5734770 that can operate on Generators, in case you are looking for that code.
