Huge XML files, XSLT memory problems, Java & SAX...
Question
I have 2 XML data files that I want to extract data from simultaneously
and transform with XSLT to generate a report. The first file is huge,
and when XSLT builds the DOM tree in memory, it runs out of space.
I only need a few branches of elements from the original XML, so I am
seeking a recommended way of building a DOM for XSLT that contains only
the elements that I need. I'm writing a Java application that invokes
Xalan, and I have been reading up on SAX parsers this afternoon. I'm sure
this is a common problem, and as such there is probably a clean and easy
way to do it, but I haven't found it yet...
thanks,
Jeff
Recommended answers
Jeff Calico wrote:
I have 2 XML data files that I want to extract data from simultaneously
and transform with XSLT to generate a report. The first file is huge,
and when XSLT builds the DOM tree in memory, it runs out of space.
I only need a few branches of elements from the original XML, so I am
seeking a recommended way of building a DOM for XSLT that contains only
the elements that I need. I'm writing a Java application that invokes
Xalan, and I have been reading up on SAX parsers this afternoon. I'm sure
this is a common problem, and as such there is probably a clean and easy
way to do it, but I haven't found it yet...
thanks,
Jeff
1. Save your XML files in an XML database.
2. Use XQuery to retrieve only the data you want.
3. Transform that data with your stylesheet.
4. Voilà: better performance, because an XML database retrieves the
data you want quickly from a huge set, and you load only the elements
you need into memory.
The only problem is that most XML databases are still in development.
Just google it.
Jeff Calico wrote:
and transform with XSLT to generate a report. The first file is huge
and when XSLT builds the DOM tree in memory, it runs out of space.
This has become a FAQ. The usual answer is to not use a DOM.
By the way, what do you consider a huge file?
DOMs should work up to a few hundred MB of XML, provided
you have enough RAM for your DOM.
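
To illustrate what "not using a DOM" can look like, here is a minimal sketch that pulls the text of selected elements out of a document with plain SAX, so no tree is ever built. The class name `SaxExtract`, the method `textOf`, and the element name passed in are all hypothetical, and the sketch assumes that elements with the target name do not nest inside each other.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxExtract {

    /**
     * Streams the input and collects the text content of every element
     * whose qualified name equals {@code name}. Memory use depends on the
     * size of the matches, not on the size of the whole document.
     * Assumption: matching elements are not nested in one another.
     */
    public static List<String> textOf(Reader in, String name) throws Exception {
        List<String> out = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder buf; // non-null only while inside a matching element

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (qName.equals(name)) {
                    buf = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (buf != null) {
                    buf.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (buf != null && qName.equals(name)) {
                    out.add(buf.toString());
                    buf = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser().parse(new InputSource(in), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<r><item>a</item><x>ignored</x><item>b</item></r>";
        System.out.println(textOf(new StringReader(xml), "item"));
    }
}
```

For a real file you would pass a `FileReader` or an `InputSource` built from a file path instead of the `StringReader` used in `main`.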
Jeff Calico wrote:
I only need a few branches of elements from the original XML, so I am
seeking a recommended way of building a DOM for XSLT that contains only
the elements that I need.
Run SAX through a SAX filter that selects the information you're concerned
with, and from there into a SAX-to-DOM builder if you need an in-memory model.
Note that DOM implementations can vary in their efficiency; I once wrote
a DOM subset that required only six words of memory per node (not
counting text contents), and Xalan-J still uses my DTM data model
internally because it's more efficient than a traditional
Java-object-based DOM implementation (as well as being a better
impedance match to the XPath data model abstraction).
As others have said: what do you consider "huge"? Exceeding physical
memory? Exceeding _virtual_ memory?
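
The filter-then-build approach described above can be sketched roughly as follows, using only the JAXP classes that ship with the JDK. This is one possible arrangement, not a definitive implementation: the names `PruningFilterDemo`, `PruningFilter`, and `prune`, and the element names in `main`, are hypothetical. The sketch keeps the root element plus every subtree whose element name is in a given set, and it does not filter comments or processing instructions.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

public class PruningFilterDemo {

    /** Forwards the root element and any subtree whose element name is wanted. */
    static class PruningFilter extends XMLFilterImpl {
        private final Set<String> wanted;
        private int depth = 0;     // element depth in the input document
        private int keepDepth = 0; // > 0 while inside a wanted subtree

        PruningFilter(XMLReader parent, Set<String> wanted) {
            super(parent);
            this.wanted = wanted;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            depth++;
            if (keepDepth > 0 || wanted.contains(qName)) {
                keepDepth++;
            }
            if (depth == 1 || keepDepth > 0) {
                super.startElement(uri, local, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (depth == 1 || keepDepth > 0) {
                super.endElement(uri, local, qName);
            }
            if (keepDepth > 0) {
                keepDepth--;
            }
            depth--;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (keepDepth > 0) { // drop text outside wanted subtrees
                super.characters(ch, start, length);
            }
        }
    }

    /** Parses the input through the filter and builds a DOM of only the kept parts. */
    public static Document prune(Reader in, Set<String> wanted) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader filter = new PruningFilter(spf.newSAXParser().getXMLReader(), wanted);
        DOMResult result = new DOMResult();
        // An identity transformer acts as the SAX-to-DOM builder here.
        TransformerFactory.newInstance().newTransformer()
                .transform(new SAXSource(filter, new InputSource(in)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><skip><a>1</a></skip><keep id=\"7\"><b>x</b></keep></root>";
        Document doc = prune(new StringReader(xml), Set.of("keep"));
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

If the intermediate DOM is not needed for anything else, the same `SAXSource` could be handed directly to a `Transformer` created from the stylesheet, so the XSLT processor builds its internal model from the filtered events only.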