Check for malicious XML before allowing DTD loading?


Question

Since libxml 2.9, loading external entities has been disabled by default when parsing XML, to prevent XXE attacks.

In that case, to be able to load a DTD file when parsing XML with PHP's DOMDocument, LIBXML_DTDLOAD must be specified.

What would be a good way to verify that only the expected DTD will be loaded, before enabling LIBXML_DTDLOAD?

One approach I can think of (as shown in the example code below) would be to keep entity loading disabled, parse the XML file once, check that the DOCTYPE declaration is as expected, and then parse the XML again with entity loading enabled. Would that be sufficient?

<?php

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article/>
XML;

// entity loading disabled

libxml_disable_entity_loader();

$doc = new DOMDocument;
$doc->loadXML($xml, LIBXML_DTDLOAD); // PHP Warning:  DOMDocument::load(): I/O warning : failed to load external entity

print $doc->doctype->systemId; // http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd

// entity loading enabled

libxml_disable_entity_loader(false);

$doc = new DOMDocument;
$doc->loadXML($xml, LIBXML_DTDLOAD);

print $doc->doctype->systemId; // http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd
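
The "check that the DOCTYPE declaration is as expected" step is described above but not shown. A minimal sketch of it follows, assuming an allowlist keyed by public ID. The constant `EXPECTED_DOCTYPES` and the helper `hasExpectedDoctype` are hypothetical names introduced here for illustration. The first parse passes no LIBXML_DTDLOAD flag, so the external subset is not fetched. Note that this only compares the identifiers in the DOCTYPE declaration; it does not inspect anything declared in an internal subset.

```php
<?php
// Hypothetical allowlist: public ID => the system ID expected alongside it.
const EXPECTED_DOCTYPES = [
    '-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN'
        => 'http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd',
];

/**
 * Hypothetical helper: report whether the DOCTYPE of $xml is on the allowlist.
 * $xml must be a non-empty string.
 */
function hasExpectedDoctype(string $xml): bool
{
    $doc = new DOMDocument;

    // No LIBXML_DTDLOAD here; LIBXML_NONET additionally forbids network access.
    // The @ operator silences warnings for malformed input.
    if (!@$doc->loadXML($xml, LIBXML_NONET) || $doc->doctype === null) {
        return false;
    }

    $publicId = $doc->doctype->publicId;

    return isset(EXPECTED_DOCTYPES[$publicId])
        && EXPECTED_DOCTYPES[$publicId] === $doc->doctype->systemId;
}
```

Only when this returns true would the document be parsed a second time with LIBXML_DTDLOAD.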


Recommended answer


If you want to whitelist the expected DTDs, you can block all others by returning NULL from your own callable, registered as the external entity loader via libxml_set_external_entity_loader.

That is, you keep using the LIBXML_DTDLOAD flag and, inside your callable, resolve to a resource handle if the DTD is whitelisted. If it is not, you return NULL.

<?php
/**
 * @link http://stackoverflow.com/q/24526493/367456
 */

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article/>
XML;

/* own entity loader */
libxml_set_external_entity_loader(function() {
  var_dump(func_get_args()); // just for demonstrating purposes
  return NULL;
});

$doc = new DOMDocument;
$doc->loadXML($xml, LIBXML_DTDLOAD);

echo "----\n";

/* restore default entity loader */    
libxml_set_external_entity_loader(NULL);

$doc = new DOMDocument;
$doc->loadXML($xml, LIBXML_DTDLOAD);

Example output:

array(3) {
  [0]=>
  string(66) "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN"
  [1]=>
  string(66) "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd"
  [2]=>
  array(4) {
    ["directory"]=>
    string(1) "/"
    ["intSubName"]=>
    string(7) "article"
    ["extSubURI"]=>
    string(66) "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd"
    ["extSubSystem"]=>
    string(66) "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN"
  }
}

Warning: DOMDocument::loadXML(): Failed to load external entity "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" in Entity, line: 2 in /in/jemmH on line 18
----

Warning: DOMDocument::loadXML(): php_network_getaddresses: getaddrinfo failed: Name or service not known in /in/jemmH on line 25

Warning: DOMDocument::loadXML(http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd): failed to open stream: php_network_getaddresses: getaddrinfo failed: Name or service not known in /in/jemmH on line 25

Notice: DOMDocument::loadXML(): failed to load external entity "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd" in Entity, line: 2 in /in/jemmH on line 25
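
The demonstration above refuses every DTD. A whitelisting loader along the lines described earlier might look like the following sketch. It assumes the allowlist is keyed by public ID and that a vetted local copy of the DTD is served in place of the remote file; the one-line stand-in DTD and the variable names `$allowed`, `$localDtd`, and `$requested` are hypothetical and chosen for illustration only.

```php
<?php
// Stand-in DTD content; in practice this would be a vetted local copy (assumption).
$localDtd = '<!ELEMENT article EMPTY>';

// Hypothetical allowlist: public ID => DTD content served in its place.
$allowed = [
    '-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN' => $localDtd,
];

$requested = []; // every [public ID, system ID] pair the parser asked for

libxml_set_external_entity_loader(
    function ($publicId, $systemId, $context) use ($allowed, &$requested) {
        $requested[] = [$publicId, $systemId];

        if ($publicId === null || !isset($allowed[$publicId])) {
            return null; // not whitelisted: refuse to load
        }

        // Whitelisted: hand libxml a resource handle holding the local copy.
        $stream = fopen('php://temp', 'r+');
        fwrite($stream, $allowed[$publicId]);
        rewind($stream);

        return $stream;
    }
);

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.0 20120330//EN" "http://jats.nlm.nih.gov/publishing/1.0/JATS-journalpublishing1.dtd">
<article/>
XML;

libxml_use_internal_errors(true); // collect parser messages instead of emitting warnings

$doc = new DOMDocument;
$loaded = $doc->loadXML($xml, LIBXML_DTDLOAD);

libxml_set_external_entity_loader(null); // restore the default loader
```

Because the loader never opens the URL supplied by the document, the system ID in the DOCTYPE cannot steer the parser to a remote or local file of the sender's choosing; an unknown public ID simply results in NULL.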
