Parsing small web page with xml2 throws XML_PARSE_HUGE error

Question

Recently, a user of my rNOMADS package in R began getting unexpected errors:

Error: Excessive depth in document: 256 use XML_PARSE_HUGE option [1]

We tracked the issue down to this command:

html.tmp <- xml2::read_html("http://nomads.ncep.noaa.gov/cgi-bin/filter_rap.pl?dir=%2Frap.20151120")

Upon following the link, it appears that the web page to be parsed is no larger than other ones that work fine, and much smaller than the 1 megabyte limit that should require the XML_PARSE_HUGE option. Furthermore, xml2::read_html has no XML_PARSE_HUGE option anyway. The only other potential solution, described here, is not appropriate for an official R package.

What is the cause of this error, and is it possible to resolve it without resorting to solutions outside the official CRAN repository?

Recommended answer

The best I can do so far is to install shabbychef's forked version of xml2, which forces XML_PARSE_HUGE. You can install this version of xml2 via

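library(drat)
# Register shabbychef's drat repository as an additional package source
drat:::add("shabbychef")
# Install xml2; the forked build is served from the repository added above
install.packages('xml2')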

For the time being, please use this workaround if you encounter XML_PARSE_HUGE errors in rNOMADS.
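
As a possible alternative to the fork, later xml2 releases on CRAN accept an options argument in read_html, and "HUGE" is one of its documented values. The sketch below assumes such a version is installed (to my knowledge 1.0.0 or newer), which was not the case when the question was asked. The helper names make_nested_html and read_html_huge are hypothetical, and the input is a synthetic string of 300 nested div elements rather than the NOMADS page, chosen only because the error message concerns nesting depth.

```r
library(xml2)

# Hypothetical helper: build a synthetic HTML string with `depth` nested <div> elements.
make_nested_html <- function(depth) {
  paste0(
    "<html><body>",
    strrep("<div>", depth),
    "leaf",
    strrep("</div>", depth),
    "</body></html>"
  )
}

# Hypothetical helper: parse with read_html's default options plus "HUGE".
# Assumes an xml2 version whose read_html() has an `options` argument.
read_html_huge <- function(x) {
  read_html(x, options = c("RECOVER", "NOERROR", "NOBLANKS", "HUGE"))
}

deep <- make_nested_html(300)

# Parse with the defaults; any error is captured as a message instead of
# stopping the script, so the two behaviours can be compared.
default_result <- tryCatch(read_html(deep), error = function(e) conditionMessage(e))

# Parse the same input with the parser limits relaxed.
doc <- read_html_huge(deep)
n_div <- length(xml_find_all(doc, "//div"))
```

If this works in your environment, the same options vector could be passed to the read_html call on the NOMADS URL from the question, which would keep the dependency on CRAN.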
