Parsing (very) large XML files with XmlSlurper


Problem description

I am fairly new to Groovy, and I am trying to read a (quite) large XML file (more than 1 GB) using XmlSlurper, which is supposed to work well with large files because it doesn't build the whole DOM in memory.

Nevertheless, I keep getting "OutOfMemoryError: Java heap space", which makes me think I am obviously doing something wrong. I tried increasing the -Xmx setting, but I would rather solve the underlying problem, since I may have to deal with even bigger files later.

Here is the line of code I used:

def posts = new XmlSlurper().parse(new File("posts.xml"))

Any hint on what's wrong?

Thanks in advance,

Jérémie.

Solution

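Groovy's XmlSlurper reads the document with a SAX parser, but it still builds the entire parsed tree (the GPathResult that parse() returns) in memory, so memory use grows with the size of the file.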

To avoid the OutOfMemoryError, you either need to raise the heap limit (as you say, using the -Xmx setting) or write your own SAX handler that extracts just the data you require from the document as it streams past.
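
To give an idea of what "your own SAX parser" could look like, here is a minimal sketch using the JDK's built-in SAX API (plain Java, which can be called from Groovy unchanged). It only counts elements with a given name, so nothing beyond a counter is kept in memory. The class name RowCounter, the helper countElements and the element name "row" are made up for illustration; the real structure of posts.xml isn't shown in the question, so adapt the handler to whatever elements and attributes you actually need.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts start tags with a given name while streaming.
public class RowCounter extends DefaultHandler {
    private final String elementName;
    private long count = 0;

    public RowCounter(String elementName) {
        this.elementName = elementName;
    }

    // Called once per opening tag; no tree is built, so memory use stays flat.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // The default SAXParserFactory is not namespace-aware, so qName holds the tag name.
        if (elementName.equals(qName)) {
            count++;
            // Per-element data could be read here, e.g. attributes.getValue("Id"),
            // and processed immediately instead of being stored.
        }
    }

    public long getCount() {
        return count;
    }

    // Streams the input through the handler and returns the number of matches.
    public static long countElements(InputStream in, String elementName) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RowCounter handler = new RowCounter(elementName);
        parser.parse(in, handler);
        return handler.getCount();
    }

    public static void main(String[] args) throws Exception {
        // Tiny in-memory sample; for a real file, pass a FileInputStream instead.
        String sample = "<posts><row Id=\"1\"/><row Id=\"2\"/><other/></posts>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(countElements(in, "row"));
    }
}
```

The point of the design is that each element is handled and then forgotten as soon as the callback returns, which is what keeps the heap small regardless of file size. The trade-off is that you lose the convenient GPath navigation XmlSlurper gives you and have to track any context you need (current parent element, text content and so on) yourself.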
