How to read an XML file with an undefined namespace with XMLReader?

Question

I'm relatively new to parsing XML files and am attempting to read a large XML file with XMLReader.

<?xml version="1.0" encoding="UTF-8"?>
<ShowVehicleRemarketing environment="Production" lang="en-CA" release="8.1-Lite" xsi:schemaLocation="http://www.starstandards.org/STAR /STAR/Rev4.2.4/BODs/Standalone/ShowVehicleRemarketing.xsd">
  <ApplicationArea>
    <Sender>
      <Component>Component</Component>
      <Task>Task</Task>
      <ReferenceId>w5/cron</ReferenceId>
      <CreatorNameCode>CreatorNameCode</CreatorNameCode>
      <SenderNameCode>SenderNameCode</SenderNameCode>
      <SenderURI>http://www.example.com</SenderURI>
      <Language>en-CA</Language>
      <ServiceId>ServiceId</ServiceId>
    </Sender>
    <CreationDateTime>CreationDateTime</CreationDateTime>
    <Destination>
      <DestinationNameCode>example</DestinationNameCode>
    </Destination>
  </ApplicationArea>
...

I'm getting the following error:

ErrorException [ Warning ]: XMLReader::read() [xmlreader.read]: compress.zlib://D:/WebDev/example/local/public/../upload/example.xml.gz:2: namespace error : Namespace prefix xsi for schemaLocation on ShowVehicleRemarketing is not defined

I've searched around and can't find much useful information on using XMLReader to read XML files with namespaces. How would I go about defining a namespace, if that is in fact what I need to do? Any help or links to pertinent resources would be appreciated.

Recommended answer

The xsi namespace prefix needs to be declared, e.g. on the root element:

<ShowVehicleRemarketing
  environment="Production"
  lang="en-CA"
  release="8.1-Lite"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.starstandards.org/STAR /STAR/Rev4.2.4/BODs/Standalone/ShowVehicleRemarketing.xsd"
>
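
To check the effect of the declaration, something like the following sketch can be used. It assumes PHP 7.1 or later with the xmlreader extension; the helper name readElementNames and the shortened sample document are made up for illustration. It collects libxml messages instead of letting them surface as warnings, then reads the document once with the xmlns:xsi declaration and once without it.

```php
<?php
// Hypothetical helper: reads $xml with XMLReader and returns
// [list of element names, list of libxml messages].
function readElementNames(string $xml): array
{
    libxml_use_internal_errors(true); // collect messages instead of raising warnings
    libxml_clear_errors();

    $reader = new XMLReader();
    $reader->XML($xml);

    $names = [];
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            $names[] = $reader->name;
        }
    }
    $reader->close();

    $messages = array_map(
        function ($error) { return trim($error->message); },
        libxml_get_errors()
    );
    libxml_clear_errors();

    return [$names, $messages];
}

// Shortened sample document with the xsi prefix declared on the root element.
$declared = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ShowVehicleRemarketing
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.starstandards.org/STAR /STAR/Rev4.2.4/BODs/Standalone/ShowVehicleRemarketing.xsd">
  <ApplicationArea/>
</ShowVehicleRemarketing>
XML;

// Same document with the declaration stripped, as in the question.
$undeclared = str_replace(
    '  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"' . "\n",
    '',
    $declared
);

foreach (['declared' => $declared, 'undeclared' => $undeclared] as $label => $xml) {
    [$names, $messages] = readElementNames($xml);
    echo $label, ': ', implode(', ', $names), "\n";
    foreach ($messages as $message) {
        echo '  libxml: ', $message, "\n";
    }
}
```

With the declaration in place, no libxml messages should be collected; without it, the namespace error from the question is expected to show up in the message list.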


Update: If you can't change the file itself, you could write a user-defined stream filter and let XMLReader read through that filter, something like:

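// register the filter class under the name "darn"
stream_filter_register('darn', 'DarnFilter');
// route the decompressed data through the filter before XMLReader sees it
$src = 'php://filter/read=darn/resource=compress.zlib://something.xml.gz';
$reader = new XMLReader();
$reader->open($src);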

The contents read by the compress.zlib wrapper are then routed through DarnFilter, which has to find the first location where it can insert the xmlns:xsi declaration. This is quite messy and takes some effort to do right, because the data arrives in buckets that can split the root tag anywhere (e.g. theoretically bucket A could contain xs, bucket B i:schem and bucket C aLocation=").
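
The bucket problem can be illustrated without any stream machinery. The following is only a sketch: addXsiDeclaration is a hypothetical helper, the chunk contents are invented, and the regular expression ignores cases such as a `>` inside an attribute value or a comment before the root element. The idea is to keep appending chunks to a buffer and only patch once a complete start tag has arrived.

```php
<?php
// Hypothetical helper: returns $buffer with an xmlns:xsi declaration added to
// the first start tag, or null if no complete start tag is buffered yet.
function addXsiDeclaration(string $buffer): ?string
{
    // first tag that is not a processing instruction, comment or DOCTYPE
    if (!preg_match('/<([^?!>\s\/]+)([^>]*)>/', $buffer, $m, PREG_OFFSET_CAPTURE)) {
        return null; // the tag is still incomplete, wait for more data
    }

    [$tag, $offset] = $m[0];
    $name  = $m[1][0];
    $attrs = $m[2][0]; // everything between the element name and ">"

    if (strpos($attrs, 'xmlns:xsi') !== false) {
        return $buffer; // already declared, nothing to do
    }

    $patched = '<' . $name
        . ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"'
        . $attrs . '>';

    return substr($buffer, 0, $offset) . $patched . substr($buffer, $offset + strlen($tag));
}

// The root tag is deliberately split across chunks, as in the bucket scenario.
$chunks = [
    '<?xml version="1.0"?><Root xs',
    'i:schem',
    'aLocation="urn:example schema.xsd">',
];

$buffer = '';
foreach ($chunks as $i => $chunk) {
    $buffer .= $chunk;
    $result = addXsiDeclaration($buffer);
    echo 'chunk ', $i, ': ', $result === null ? 'waiting for more data' : $result, "\n";
}
```

The first two chunks leave the helper waiting; only after the third chunk completes the start tag is the declaration inserted.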

Update 2: Here's an ad-hoc example of a filter in PHP that inserts the xsi namespace declaration. It is mostly untested (it worked with the one test I ran ;-) ) and undocumented. Take it as a proof of concept, not production code.

<?php
stream_filter_register('darn', 'DarnFilter');
$src = 'php://filter/read=darn/resource=compress.zlib://d:/test.xml.gz';

$r = new XMLReader;
$r->open($src);
while($r->read()) {
  echo '.';
}

class DarnFilter extends php_user_filter {
  protected $buffer='';
  protected $status = PSFS_FEED_ME;

  public function filter($in, $out, &$consumed, $closing): int
  {
    while ( $bucket = stream_bucket_make_writeable($in) ) {
      $consumed += $bucket->datalen;
      if ( PSFS_PASS_ON == $this->status ) {
        // we're already done, just copy the content
        stream_bucket_append($out, $bucket);
      }
      else {
        $this->buffer .= $bucket->data;
        if ( $this->foo() ) {
          // first element found
          // send the current buffer          
          $bucket->data = $this->buffer;
          $bucket->datalen = strlen($bucket->data);
          stream_bucket_append($out, $bucket);
          $this->buffer = null;
          // no need for further processing
          $this->status = PSFS_PASS_ON;
        }
      }
    }
    return $this->status;
  }

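  /* Looks for the first (root) element in $this->buffer and,
  *  if it doesn't contain an xsi namespace declaration, inserts one.
  *  Returns true once a complete start tag has been buffered.
  *  Note: the pattern also matches a comment or DOCTYPE preceding the root element.
  */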
  protected function foo() {
    $rc = false;
    if ( preg_match('!<([^?>\s]+)\s?([^>]*)>!', $this->buffer, $m, PREG_OFFSET_CAPTURE) ) {
      $rc = true;
      if ( false===strpos($m[2][0], 'xmlns:xsi') ) {
        echo ' inserting xsi decl ';
        $in = '<'.$m[1][0]
          . ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
          . $m[2][0] . '>';    
        $this->buffer = substr($this->buffer, 0, $m[0][1])
          . $in
          . substr($this->buffer, $m[0][1] + strlen($m[0][0]));
      }
    }
    return $rc;
  }
}


Update 3: And here's an ad-hoc solution written in C#, which hands the reader a parser context that already knows the xsi prefix:

using System.IO;
using System.IO.Compression;
using System.Xml;

XmlNamespaceManager nsmgr = new XmlNamespaceManager(new NameTable());
// prime the XmlReader with the xsi namespace
nsmgr.AddNamespace("xsi", "http://www.w3.org/2001/XMLSchema-instance");

using ( XmlReader reader = XmlReader.Create(
  new GZipStream(new FileStream(@"\test.xml.gz", FileMode.Open, FileAccess.Read), CompressionMode.Decompress),
  new XmlReaderSettings(),
  new XmlParserContext(null, nsmgr, null, XmlSpace.None)
)) {
  // the reader resolves xsi: from the supplied parser context
  while (reader.Read())
  {
    System.Console.Write('.');
  }
}
