How to read a large XML file consisting of a large number of small items efficiently in Java?

Question

I have a large XML file that consists of items of relatively fixed size, i.e.:

<rootElem>
  <item>...</item>
  <item>...</item>
  <item>...</item>
</rootElem>

The item elements are relatively shallow and typically rather small (<100 KB), but there may be a lot of them (hundreds of thousands). The items are completely independent of each other.

How can I process the file efficiently in Java? I can't read the whole file in as a DOM, and I don't like using SAX because the code gets rather complex. I'd also like to avoid splitting the file into smaller pieces.

Ideally, I could obtain each item element, one at a time, as a separate DOM document that I could process using tools like JAXB. Basically, I just want to loop once over all the items.

I would think that this is a rather common problem.

Recommended answer

Java 6 has StAX support. It performs stream processing like SAX, but uses a pull-based approach, which leads to simpler handling code.
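As a rough illustration of the pull-based approach, the sketch below walks the file with an XMLStreamReader and copies each item element into its own small DOM Document using an identity Transformer fed by a StAXSource. The class name ItemStreamer and the method forEachItem are made up for this example, and the element name "item" is taken from the question. The loop inspects the reader's current event after each copy instead of assuming where the transformer left the cursor, since that position may differ between implementations. This is one possible way to do it under those assumptions, not the only one.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stax.StAXSource;
import org.w3c.dom.Document;

// Hypothetical helper for illustration; the names are not from the original answer.
final class ItemStreamer {

    /** Calls handler once per <item>, passing that item as its own DOM Document. */
    static void forEachItem(Reader xml, Consumer<Document> handler) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // A Transformer created without a stylesheet copies its input unchanged.
            Transformer identity = TransformerFactory.newInstance().newTransformer();

            while (reader.hasNext()) {
                if (reader.isStartElement() && "item".equals(reader.getLocalName())) {
                    // Copy only the current <item> subtree into a fresh document.
                    Document doc = builder.newDocument();
                    identity.transform(new StAXSource(reader), new DOMResult(doc));
                    handler.accept(doc);
                    // Do not advance here: the transformer has already consumed the
                    // subtree, and the cursor may already be on the next start tag.
                } else {
                    reader.next();
                }
            }
            reader.close();
        } catch (XMLStreamException | TransformerException | ParserConfigurationException e) {
            throw new IllegalStateException("Failed to stream items", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<rootElem><item id=\"1\">a</item><item id=\"2\">b</item></rootElem>";
        forEachItem(new StringReader(xml), doc ->
            System.out.println(doc.getDocumentElement().getAttribute("id")
                + " = " + doc.getDocumentElement().getTextContent()));
    }
}
```

Only one item is held as a DOM at a time, so memory use depends on the item size rather than the file size. For a real file, a FileReader or an InputStream-based reader would replace the StringReader. If JAXB is available, each Document could be handed to an Unmarshaller in place of the printing handler.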
