DOM Error - ID 'someAnchor' already defined in Entity, line X


Question


If I try to load an HTML document into PHP DOM I get an error along the lines of:

Error DOMDocument::loadHTML() [domdocument.loadhtml]: ID someAnchor already defined in Entity, line: 9

I cannot work out why. Here is some code that loads an HTML string into DOM.

The first string contains no anchor tag; the second contains one. Only the second document produces the error.

You should be able to paste it into a script and run it to see the same output:

<?php
ini_set('display_errors', 1);
error_reporting(E_ALL);


$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
</body>
</html>
EOT;

$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<h1>Hello</h1>
<a name="someAnchor" id="someAnchor"></a>
</body>
</html>
EOT;

class domGrabber
    {
    public $_FileErrorStr = '';

    /**
    *@desc DOM object factory does the work of loading the DOM object
    */
    public function getLoadAsDOMObj($htmlString)
        {
        $this->_FileErrorStr =''; //reset error container
        $xmlDoc = new DOMDocument();
        set_error_handler(array($this, '_FileErrorHandler')); // Warnings and errors are suppressed
        $xmlDoc->loadHTML($htmlString);
        restore_error_handler();
        return $xmlDoc;
        }

    /**
    *@desc public so that it can catch errors from outside this class
    */
    public function _FileErrorHandler($errno, $errstr, $errfile, $errline)
        {
        if ($this->_FileErrorStr === null)
            {
            $this->_FileErrorStr = $errstr;
            }
        else    {
            $this->_FileErrorStr .= (PHP_EOL . $errstr);
            }
        }
    }

$domGrabber = new  domGrabber();
$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithNoAnchor );

echo 'PHP Version: '. phpversion() .'<br />'."\n";

echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."\n";
if ($domGrabber->_FileErrorStr)
    {
    echo 'Error'. $domGrabber->_FileErrorStr;
    }



$xmlDoc = $domGrabber->getLoadAsDOMObj($stringWithAnchor);
echo '<pre>';
print $xmlDoc->saveXML();
echo '</pre>'."\n";
if ($domGrabber->_FileErrorStr)
    {
    echo 'Error'. $domGrabber->_FileErrorStr;
    }

I get the following output in Firefox's source view:

PHP Version: 5.2.9<br />
<pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body>
<h1>Hello</h1>
</body></html>
</pre>
<pre><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"><head><title>My document</title><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /></head><body>
<h1>Hello</h1>
<a name="someAnchor" id="someAnchor"></a>

</body></html>
</pre>
Error
DOMDocument::loadHTML() [<a href='domdocument.loadhtml'>domdocument.loadhtml</a>]: ID someAnchor already defined in Entity, line: 9

So, why is DOM saying that someAnchor is already defined?


Update:

I tried two changes, and each one fixes the error on its own:

  • Using the loadXML() method instead of loadHTML().
  • Using only the id attribute instead of both id and name.

For completeness, here is the comparison script:

<?php
ini_set('display_errors', 1);
error_reporting(E_ALL);


$stringWithNoAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithNoAnchor</p>
</body>
</html>
EOT;

$stringWithAnchor = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchor</p>
<a  name="someAnchor" id="someAnchor" ></a>
</body>
</html>
EOT;

$stringWithAnchorButOnlyIdAtt = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>My document</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
</head>
<body >
<p>stringWithAnchorButOnlyIdAtt</p>
<a id="someAnchor"></a>
</body>
</html>
EOT;

class domGrabber
    {
    public $_FileErrorStr = '';
    public $useHTMLMethod = TRUE;

    /**
    *@desc DOM object factory does the work of loading the DOM object
    */
    public function loadDOMObjAndWriteOut($htmlString)
        {
        $this->_FileErrorStr ='';

        $xmlDoc = new DOMDocument();
        set_error_handler(array($this, '_FileErrorHandler')); // Warnings and errors are suppressed


        if ($this->useHTMLMethod)
            {
            $xmlDoc->loadHTML($htmlString);
            }
        else    {
            $xmlDoc->loadXML($htmlString);
            }


        restore_error_handler();

        echo "<h1>";
        echo ($this->useHTMLMethod) ? 'using $xmlDoc->loadHTML()' : 'using $xmlDoc->loadXML()';
        echo "</h1>";
        echo '<pre>';
        print $xmlDoc->saveXML();
        echo '</pre>'."\n";
        if ($this->_FileErrorStr)
            {
            echo 'Error'. $this->_FileErrorStr;
            }
        }

    /**
    *@desc public so that it can catch errors from outside this class
    */
    public function _FileErrorHandler($errno, $errstr, $errfile, $errline)
        {
        if ($this->_FileErrorStr === null)
            {
            $this->_FileErrorStr = $errstr;
            }
        else    {
            $this->_FileErrorStr .= (PHP_EOL . $errstr);
            }
        }
    }

$domGrabber = new  domGrabber();

echo 'PHP Version: '. phpversion() .'<br />'."\n";

$domGrabber->useHTMLMethod = TRUE; //DOM->loadHTML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);

$domGrabber->useHTMLMethod = FALSE; //use DOM->loadXML
$domGrabber->loadDOMObjAndWriteOut($stringWithNoAnchor);
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchor );
$domGrabber->loadDOMObjAndWriteOut($stringWithAnchorButOnlyIdAtt);

Solution

If you are loading XML files (which is the case here, since XHTML is XML), then you should use DOMDocument::loadXML(), not DOMDocument::loadHTML().
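
A minimal sketch of that advice, assuming the input is well-formed XHTML like the question's $stringWithAnchor. The variable names ($xhtml, $doc, $loaded, $anchors) and the shortened document are illustrative and not taken from the original script:

<?php
// Illustrative XHTML input containing the anchor from the question.
$xhtml = <<<EOT
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>My document</title></head>
<body>
<a name="someAnchor" id="someAnchor"></a>
</body>
</html>
EOT;

$doc = new DOMDocument();
// loadXML() goes through the XML parser and returns false when the input is not well-formed.
// No LIBXML_DTDLOAD flag is passed, so the external DTD should not be fetched.
$loaded = $doc->loadXML($xhtml);

// Look the anchor up by tag name to confirm it made it into the tree.
$anchors = $doc->getElementsByTagName('a');
echo $loaded ? 'loaded' : 'failed', ', anchors found: ', $anchors->length, "\n";

Note that this only suits documents that really are well-formed XML; ordinary tag-soup HTML would make loadXML() fail, so loadHTML() remains the tool for that case.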

When the document is parsed as HTML, both name and id are treated as introducing an ID. The single anchor element therefore declares the ID "someAnchor" twice, hence the error.

However, the W3C validator accepts the form you show, <a id="someAnchor" name="someAnchor"></a>, in which id and name carry the same value on one element. This may be a bug in libxml2.
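
Whether the message appears at all may depend on the libxml2 version PHP is built against. The sketch below is one way to check a given installation. It swaps the question's set_error_handler() approach for libxml_use_internal_errors() and libxml_get_errors(), and the names $html, $previous and $messages are illustrative:

<?php
// Illustrative input: one anchor carrying the same value in name and id.
$html = '<html><body><a name="someAnchor" id="someAnchor"></a></body></html>';

// Collect libxml messages internally instead of raising PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Each entry is a LibXMLError object exposing, among others, ->message and ->line.
$messages = array();
foreach (libxml_get_errors() as $error) {
    $messages[] = trim($error->message) . ' (line ' . $error->line . ')';
}

// Empty the buffer and restore the previous error-handling mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Whether an "already defined" message is listed depends on the libxml2 build.
echo count($messages) ? implode("\n", $messages) : 'no parser messages', "\n";

Either way the document still loads, which matches the question's output: the anchor is present in the saved XML even when the warning is raised.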

In this bug report for libxml2, a user proposes a patch to only consider the name attribute as an ID:

According to the HTML and XHTML specs, only the a element's name attribute shares name space with id attributes. For some of the elements it can be argued that multiple instances with the same name don't make sense, but they should nevertheless not be considered in the same namespace as other elements' id attributes.

See http://www.zvon.org/xxl/xhtmlReference/Output/Strict/attr_name.html for all the elements that take name attributes and their semantics.
