Problem - XML declaration allowed only at the start of the document
Question
xml:19558: parser error : XML declaration allowed only at the start of the document
Are there any solutions? I am using PHP's XMLReader to parse a large XML file, but I keep getting this error. I know the file is not well formed, but I don't think it is possible to go through the file and remove these extra declarations. Any ideas would be appreciated.
Recommended answer
Make sure there isn't any whitespace before the first tag. Try this:
<?php
// Declarations
$file = "data.txt"; // The file to read from.

// Read the whole file into memory
$fp = fopen($file, "r") or die("Couldn't open $file"); // Open the file
$data = ""; // Holds the file's content
while (!feof($fp)) // Loop through the file until the end
{
    $data .= fgets($fp, 1024); // Append up to 1023 bytes of the current line
}
fclose($fp); // Close the file

// Split after every closing </xml> tag so each document gets its own string.
// This assumes the root element of every document is <xml>.
$split = preg_split('/(?<=<\/xml>)(?!$)/', $data);
foreach ($split as $sxml) // Loop through each XML string
{
    $sxml = ltrim($sxml); // Nothing, not even a newline, may precede the XML declaration
    if ($sxml === '')
        continue; // Skip whitespace-only fragments
    $reader = new XMLReader(); // Initialize the reader
    $reader->xml($sxml) or die("Could not load the XML string"); // Load the current XML string
    while ($reader->read()) // Walk the nodes
    {
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'record')
        {
            $dataa = $reader->readInnerXml(); // Get the contents of the <record> element
            echo $dataa; // Print it to the screen
        }
    }
    $reader->close(); // Close the reader
}
?>
Set the $file variable to the file you want. Note that I don't know how well this will work for a 4 GB file, since it loads the entire file into memory. Tell me if it doesn't.
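To illustrate the whitespace point, here is a minimal sketch. The helper name `parsesCleanly` is made up for this example, and the sample string is invented; it relies on libxml's error buffering, which ships with PHP's XML extensions. A fragment that starts with a newline before its declaration should be rejected, while the same fragment passed through `ltrim()` should parse.

```php
<?php
// Hypothetical helper: returns true when XMLReader walks $xml without libxml errors.
function parsesCleanly(string $xml): bool
{
    $previous = libxml_use_internal_errors(true); // Buffer libxml errors instead of printing them
    libxml_clear_errors();
    $reader = new XMLReader();
    $reader->XML($xml);
    while (@$reader->read()) {
        // Walk every node; @ hides the extra PHP warning read() raises on failure.
    }
    $reader->close();
    $ok = count(libxml_get_errors()) === 0;
    libxml_clear_errors();
    libxml_use_internal_errors($previous); // Restore the previous setting
    return $ok;
}

// Invented sample: a newline sits in front of the XML declaration.
$fragment = "\n<?xml version=\"1.0\"?><xml><record>1</record></xml>";
var_dump(parsesCleanly($fragment));        // Leading newline left in place
var_dump(parsesCleanly(ltrim($fragment))); // Leading newline removed
```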
Here is another solution. It should work better with larger files, because it parses while it is reading the file.
<?php
set_time_limit(0); // Large files can take a while

// Declarations
$file = "data.txt"; // The file to read from.

$fp = fopen($file, "r") or die("Couldn't open $file"); // Open the file
$FoundXmlTagStep = 0;     // Progress through matching "<xml" (4 = inside a document)
$FoundEndXMLTagStep = 0;  // Progress through matching "</xml>"
$curXML = "";             // The document collected so far
$firstXMLTagRead = false; // True right after "<xml" has been matched
while (!feof($fp)) // Loop through the file, one character at a time
{
    $data = fgets($fp, 2); // A length of 2 reads a single byte
    if ($FoundXmlTagStep == 0 && $data == "<")
        $FoundXmlTagStep = 1;
    else if ($FoundXmlTagStep == 1 && $data == "x")
        $FoundXmlTagStep = 2;
    else if ($FoundXmlTagStep == 2 && $data == "m")
        $FoundXmlTagStep = 3;
    else if ($FoundXmlTagStep == 3 && $data == "l")
    {
        $FoundXmlTagStep = 4;
        $firstXMLTagRead = true;
    }
    else if ($FoundXmlTagStep != 4)
        $FoundXmlTagStep = ($data == "<") ? 1 : 0; // A "<" may start a new match

    if ($FoundXmlTagStep == 4)
    {
        if ($firstXMLTagRead)
        {
            $firstXMLTagRead = false;
            $curXML = "<xm"; // The "l" is appended just below
        }
        $curXML .= $data;

        // Start trying to match the end of the document
        if ($FoundEndXMLTagStep == 0 && $data == "<")
            $FoundEndXMLTagStep = 1;
        elseif ($FoundEndXMLTagStep == 1 && $data == "/")
            $FoundEndXMLTagStep = 2;
        elseif ($FoundEndXMLTagStep == 2 && $data == "x")
            $FoundEndXMLTagStep = 3;
        elseif ($FoundEndXMLTagStep == 3 && $data == "m")
            $FoundEndXMLTagStep = 4;
        elseif ($FoundEndXMLTagStep == 4 && $data == "l")
            $FoundEndXMLTagStep = 5;
        elseif ($FoundEndXMLTagStep == 5 && $data == ">")
        {
            $FoundEndXMLTagStep = 0;
            $FoundXmlTagStep = 0;
            // Finished reading one document
            ParseXML($curXML);
        }
        else
            $FoundEndXMLTagStep = ($data == "<") ? 1 : 0; // A "<" may start a new match
    }
}
fclose($fp); // Close the file

function ParseXML($xml)
{
    $reader = new XMLReader(); // Initialize the reader
    $reader->xml($xml) or die("Could not load the XML string"); // Load the current XML string
    while ($reader->read()) // Walk the nodes
    {
        if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'record')
        {
            $dataa = $reader->readInnerXml(); // Get the contents of the <record> element
            echo $dataa; // Print it to the screen
        }
    }
    $reader->close(); // Close the reader
}
?>
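If the documents in the file do not all share a known root element such as `<xml>`, another option might be to split on the XML declaration itself. The following is only a sketch under stated assumptions: the function name `splitXmlDocuments` and the chunk size are made up for illustration, and it assumes every document begins with `<?xml ` (declaration followed by a space) and that this byte sequence never appears inside a comment or CDATA section. It reads the file in chunks, so only one document at a time is held in memory.

```php
<?php
// Hypothetical helper: yields each concatenated XML document in $path as its own string.
function splitXmlDocuments(string $path, int $chunkSize = 8192): Generator
{
    $marker = '<?xml '; // Assumption: every document starts with this declaration prefix
    $fp = fopen($path, 'r');
    if ($fp === false) {
        throw new RuntimeException("Could not open $path");
    }
    $buffer = '';
    $searchFrom = 1; // Offset 0 belongs to the current document's own declaration
    while (!feof($fp)) {
        $chunk = fread($fp, $chunkSize);
        if ($chunk === false) {
            break;
        }
        if ($chunk === '') {
            continue;
        }
        $buffer .= $chunk;
        // Every further marker in the buffer closes the document before it.
        while (($pos = strpos($buffer, $marker, $searchFrom)) !== false) {
            $doc = trim(substr($buffer, 0, $pos));
            if ($doc !== '') {
                yield $doc;
            }
            $buffer = substr($buffer, $pos);
            $searchFrom = 1;
        }
        // A marker may straddle two chunks, so rescan the last few bytes next time.
        $searchFrom = max(1, strlen($buffer) - strlen($marker) + 1);
    }
    fclose($fp);
    $doc = trim($buffer);
    if ($doc !== '') {
        yield $doc; // Whatever remains is the last document
    }
}
```

Each yielded string starts directly with its declaration, so it could be handed to the `ParseXML` function above, for example `foreach (splitXmlDocuments($file) as $doc) { ParseXML($doc); }`.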