Slow construction of tree structure from XML


Question

I'm parsing an XML document into my own structure, but building it is very slow for large inputs. Is there a better way to do it?

public static DomTree<String> createTreeInstance(String path) 
  throws ParserConfigurationException, SAXException, IOException {
    DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = docBuilderFactory.newDocumentBuilder();
    File f = new File(path);
    Document doc = db.parse(f);       
    Node node = doc.getDocumentElement(); 
    DomTree<String> tree = new DomTree<String>(node);
    return tree;
}

Here is my DomTree constructor:

    /**
     * Recursively builds a tree structure from a DOM object.
     * @param root
     */
    public DomTree(Node root){      
        node = root;        
        NodeList children = root.getChildNodes();
        DomTree<String> child = null;
        for(int i = 0; i < children.getLength(); i++){  
            child = new DomTree<String>(children.item(i));
            if (children.item(i).getNodeType() != Node.TEXT_NODE){
                super.children.add(child);
            }
        }
    }

Update:

I have benchmarked the createTreeInstance() method using a 100 MB XML file:

  • Creating docBuilderFactory ... done [3ms]
  • Creating docBuilder ... done [21ms]
  • Parsing file ... done [5646ms]
  • getDocumentElement ... done [1ms]
  • Creating DomTree ... done [17076ms]

Update:

As John Doe suggests below, it may be more appropriate to use SAX. I have never used SAX before, so is there a good way to convert what I have to use SAX?

Recommended answer

If you're parsing a large XML document, don't use DOM. Use SAX, a pull parser such as XPP3, or some other parser that does not build the whole document in memory.

The drawback is that you won't have an "XML tree" in memory, which might have been convenient; you only get events and handle them as they arrive. However, it is much more memory-efficient, and you can map elements directly to your own data structures.
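To make the event-based approach concrete, here is a minimal sketch of building a tree directly from SAX events using the JDK's built-in parser. `TreeNode` and `SaxTreeBuilder` are hypothetical names introduced for illustration; `TreeNode` is only a stand-in for the question's `DomTree`, whose full definition is not shown. The sketch records element nodes only, so text is skipped as in the original constructor, but comments and processing instructions are skipped too.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for DomTree: an element name plus its child elements. */
final class TreeNode {
    final String name;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String name) {
        this.name = name;
    }
}

/** Builds a TreeNode tree from SAX events without creating a DOM first. */
final class SaxTreeBuilder extends DefaultHandler {
    // Elements that have been opened but not yet closed; the top is the current parent.
    private final Deque<TreeNode> open = new ArrayDeque<>();
    private TreeNode root;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        TreeNode node = new TreeNode(qName);
        if (open.isEmpty()) {
            root = node;                     // first element seen is the document element
        } else {
            open.peek().children.add(node);  // attach to the enclosing element
        }
        open.push(node);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();                          // element closed; its parent is current again
    }

    /** Parses the stream and returns the root of the resulting tree. */
    static TreeNode parse(InputStream in)
            throws ParserConfigurationException, SAXException, IOException {
        SaxTreeBuilder handler = new SaxTreeBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.root;
    }
}
```

Calling `SaxTreeBuilder.parse(new FileInputStream(path))` would take the place of `createTreeInstance(path)`. The stack mirrors the nesting of open elements, so each node is attached to its parent in a single pass and no intermediate `Document` is ever built. If text content or attributes are needed, overriding `characters()` or reading `attrs` in `startElement` would be the place to capture them.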
