加快xpath [英] Speeding up xpath

查看:57
本文介绍了加快xpath的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个1000个条目的文档,其格式如下:

I have a 1000 entry document whose format is something like:

<Example>
     <Entry>
          <n1></n1>
          <n2></n2>
      </Entry>
      <Entry>
          <n1></n1>
          <n2></n2>
      </Entry>
      <!--and so on-->

这里有1000多个Entry节点.我正在编写一个Java程序,该程序基本上一个接一个地获取所有节点,并对每个节点进行一些分析.但是问题是节点的检索时间随其数目的增加而增加.例如,检索第一个节点需要78毫秒,而检索第二个节点需要100毫秒,并且一直在增加.而要检索999节点,则需要花费5秒钟以上的时间.这非常慢.我们将把这段代码插入到XML文件中,该文件甚至有1000多个条目.有些像数百万.解析整个文档的总时间超过5分钟.

There are more than 1000 Entry nodes here. I am writing a Java program which basically gets all the node one by one and do some analyzing on each node. But the problem is that the retrieval time of the nodes increases with its no. For example it takes 78 millisecond to retrieve the first node 100 ms to retrieve the second and it keeps on increasing. And to retrieve the 999 node it takes more than 5 second. This is extremely slow. We would be plugging this code to XML files which have even more than 1000 entries. Some like millions. The total time to parse the whole document is more than 5 minutes.

我正在使用这个简单的代码来遍历它.这里的 nxp 是我自己的类,它具有从xpath获取节点的所有方法.

I am using this simple code to traverse it. Here nxp is my own class which has all the methods to get nodes from xpath.

nxp.fromXpathToNode("/Example/Entry" + "[" + i  + "]", doc);    

doc 是该文件的文档. i 是要检索的节点数.

and doc is the document for the file. i is the no of node to retrieve.

另外,当我尝试这样的事情

Also when i try something like this

List<Node> nl = nxp.fromXpathToNodes("/Example/Entry",doc);  
      content = nl.get(i);    

我遇到同样的问题.

任何人都无法解决如何加快节点迁移速度的问题,因此从XML文件中获取第一个节点和第1000个节点所花费的时间相同.

Anyone has any solution on how to speed up the tretirival of the nodes, so it takes the same amount of time to get the 1st node as well as the 1000 node from the XML file.

这是xpathtonode的代码.

Here is the code for xpathtonode.

public Node fromXpathToNode(String expression, Node context)  
{  
    try  
    {  
        return (Node)this.getCachedExpression(expression).evaluate(context, XPathConstants.NODE);  
    }  
    catch (Exception cause)  
    {  
        throw new RuntimeException(cause);  
    }  
}  

这是fromxpathtonodes的代码.

and here is the code for fromxpathtonodes.

public List<Node> fromXpathToNodes(String expression, Node context)  
{  
    List<Node> nodes = new ArrayList<Node>();  
    NodeList results = null;  
    
    try  
    {  
        results = (NodeList)this.getCachedExpression(expression).evaluate(context, XPathConstants.NODESET);  
          
        for (int index = 0; index < results.getLength(); index++)  
        {  
            nodes.add(results.item(index));  
        }  
    }  
    catch (Exception cause)  
    {  
        throw new RuntimeException(cause);  
    }  
    
    return nodes;  
}  

这是开始

public class NativeXpathEngine implements XpathEngine  
{      
private final XPathFactory factory;  
  
private final XPath engine;  

/**
 * Cache for previously compiled XPath expressions. {@link XPathExpression#hashCode()}
 * is not reliable or consistent so use the textual representation instead.
 */  
private final Map<String, XPathExpression> cachedExpressions;  
  
public NativeXpathEngine()  
{
    super();  
    
    this.factory = XPathFactory.newInstance();  
    this.engine = factory.newXPath();  
    this.cachedExpressions = new HashMap<String, XPathExpression>();  
}  

推荐答案

尝试 VTD-XML .它使用的内存少于DOM.它比SAX易于使用,并且支持XPath.这是一些示例代码,可以帮助您入门.它使用XPath来获取Entry元素,然后打印出n1和n2子元素.

Try VTD-XML. It uses less memory than DOM. It is easier to use than SAX and supports XPath. Here is some sample code to help you get started. It applies an XPath to get the Entry elements and then prints out the n1 and n2 child elements.

final VTDGen vg = new VTDGen();
vg.parseFile("/path/to/file.xml", false);

final VTDNav vn = vg.getNav();
final AutoPilot ap = new AutoPilot(vn);
ap.selectXPath("/Example/Entry");
int count = 1;
while (ap.evalXPath() != -1) {
    System.out.println("Inside Entry: " + count);

    //move to n1 child
    vn.toElement(VTDNav.FIRST_CHILD, "n1");
    System.out.println("\tn1: " + vn.toNormalizedString(vn.getText()));

    //move to n2 child
    vn.toElement(VTDNav.NEXT_SIBLING, "n2");
    System.out.println("\tn2: " + vn.toNormalizedString(vn.getText()));

    //move back to parent
    vn.toElement(VTDNav.PARENT);
    count++;
}

这篇关于加快xpath的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆