XPath, XML Namespaces and Java


Question

I've spent the past day attempting to extract a single XML node from the following document, and I can't grasp the nuances of XML namespaces well enough to make it work.

The XML file is too large to post in full, so here is the portion that concerns me:

<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<XFDL xmlns="http://www.PureEdge.com/XFDL/6.5" xmlns:custom="http://www.PureEdge.com/XFDL/Custom" xmlns:designer="http://www.PureEdge.com/Designer/6.1" xmlns:pecs="http://www.PureEdge.com/PECustomerService" xmlns:xfdl="http://www.PureEdge.com/XFDL/6.5">
   <globalpage sid="global">
      <global sid="global">
         <xmlmodel xmlns:xforms="http://www.w3.org/2003/xforms">
            <instances>
               <xforms:instance id="metadata">
                  <form_metadata>
                     <metadataver version="1.0"/>
                     <metadataverdate>
                        <date day="05" month="Jul" year="2005"/>
                     </metadataverdate>
                     <title>
                        <documentnbr number="2062" prefix.army="DA" scope="army" suffix=""/>
                        <longtitle>HAND RECEIPT/ANNEX NUMBER </longtitle>
                     </title>

The document continues and is well formed all the way down. I am attempting to extract the "number" attribute from the "documentnbr" tag (third line from the bottom of the excerpt).

The code that I'm using to do this looks like this:

/***
     * Locates the Document Number information in the file and returns the form number.
     * @return File's self-declared number.
     * @throws InvalidFormException Thrown when XPath cannot find the "documentnbr" element in the file.
     */
    public String getFormNumber() throws InvalidFormException
    {
        try{
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new XFDLNamespaceContext());

            Node result = (Node)xPath.evaluate(QUERY_FORM_NUMBER, doc, XPathConstants.NODE);
            if(result != null) {
                return result.getNodeValue();
            } else {
                throw new InvalidFormException("Unable to identify form.");
            }

        } catch (XPathExpressionException err) {
            throw new InvalidFormException("Unable to find form number in file.");
        }

    }

Here QUERY_FORM_NUMBER is my XPath expression, and XFDLNamespaceContext implements NamespaceContext and looks like this:

public class XFDLNamespaceContext implements NamespaceContext {

    @Override
    public String getNamespaceURI(String prefix) {
        if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
        else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
            return "http://www.PureEdge.com/XFDL/6.5";
        else if ("custom".equals(prefix))
            return "http://www.PureEdge.com/XFDL/Custom";
        else if ("designer".equals(prefix)) 
            return "http://www.PureEdge.com/Designer/6.1";
        else if ("pecs".equals(prefix)) 
            return "http://www.PureEdge.com/PECustomerService";
        else if ("xfdl".equals(prefix))
            return "http://www.PureEdge.com/XFDL/6.5";      
        else if ("xforms".equals(prefix)) 
            return "http://www.w3.org/2003/xforms";
        else    
            return XMLConstants.NULL_NS_URI;
    }

    @Override
    public String getPrefix(String arg0) {
        // TODO Auto-generated method stub
        return null;
    }

    @Override
    public Iterator getPrefixes(String arg0) {
        // TODO Auto-generated method stub
        return null;
    }

}

I've tried many different XPath queries, but I keep feeling that this one should work:

protected static final String QUERY_FORM_NUMBER = 
        "/globalpage/global/xmlmodel/xforms:instances/instance" + 
        "/form_metadata/title/documentnbr[number]";

Unfortunately it does not work, and I keep getting null back.

I've done a fair amount of reading here, here, and here, but nothing has proved sufficiently illuminating to help me get this working.

I'm almost positive that I'm going to face-palm when I figure this out, but I'm really at my wit's end as to what I'm missing.

Thank you for reading through all of this, and thanks in advance for the help.

-Andy

Recommended answer

Aha, I debugged your expression and got it to work. You missed a few things. This XPath expression should do it:

/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number

  1. You need to include the root element (XFDL in this case).
  2. I didn't end up needing to use any namespaces in the expression for some reason. Not sure why. When that is the case, NamespaceContext.getNamespaceURI() never gets called. If I replace instance with xforms:instance, then getNamespaceURI() gets called once with xforms as the input argument, but the program throws an exception.
  3. The syntax for selecting an attribute value is /@attr, not [attr]. The bracket form is a predicate that filters on a child element named attr.
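
A possible explanation for point 2, offered as a sketch rather than a definitive diagnosis: DocumentBuilderFactory is not namespace-aware unless setNamespaceAware(true) is called, and XPath 1.0 treats unprefixed names as belonging to no namespace, so the default-prefix branch of getNamespaceURI() is never consulted. The example below assumes a namespace-aware parse and binds the prefixes xfdl and xforms to the URIs from the question. The class name NamespaceAwareLookup, the method formNumber, and the trimmed inline XML are illustrative assumptions, not part of the original code.

```java
import java.io.StringReader;
import java.util.Iterator;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class NamespaceAwareLookup {
    // Trimmed-down stand-in for the form in the question (assumption: same namespaces).
    public static final String XML =
        "<XFDL xmlns=\"http://www.PureEdge.com/XFDL/6.5\">"
      + "<globalpage sid=\"global\"><global sid=\"global\">"
      + "<xmlmodel xmlns:xforms=\"http://www.w3.org/2003/xforms\"><instances>"
      + "<xforms:instance id=\"metadata\"><form_metadata><title>"
      + "<documentnbr number=\"2062\"/>"
      + "</title></form_metadata></xforms:instance>"
      + "</instances></xmlmodel></global></globalpage></XFDL>";

    // Every element is prefixed: unprefixed elements in the file inherit the
    // default namespace, which the XPath side has to name explicitly as xfdl.
    public static final String QUERY =
        "/xfdl:XFDL/xfdl:globalpage/xfdl:global/xfdl:xmlmodel/xfdl:instances"
      + "/xforms:instance/xfdl:form_metadata/xfdl:title/xfdl:documentnbr/@number";

    public static String formNumber(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    if ("xfdl".equals(prefix)) return "http://www.PureEdge.com/XFDL/6.5";
                    if ("xforms".equals(prefix)) return "http://www.w3.org/2003/xforms";
                    return XMLConstants.NULL_NS_URI;
                }
                // Reverse lookups are left unimplemented in this sketch.
                public String getPrefix(String namespaceURI) { return null; }
                public Iterator<String> getPrefixes(String namespaceURI) { return null; }
            });
            // The two-argument evaluate() returns the string value of the match.
            return xPath.evaluate(QUERY, doc);
        } catch (Exception e) {
            throw new IllegalStateException("Unable to evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formNumber(XML));
    }
}
```

The prefixes used in the expression only need to match what the NamespaceContext returns; they do not have to match the prefixes (or lack of them) in the file itself.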

My full sample code:

import java.io.File;
import java.io.IOException;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public class XPathNamespaceExample {
    static public class MyNamespaceContext implements NamespaceContext {
        final private Map<String, String> prefixMap;
        MyNamespaceContext(Map<String, String> prefixMap)
        {
            if (prefixMap != null)
            {
                this.prefixMap = Collections.unmodifiableMap(new HashMap<String, String>(prefixMap));
            }
            else
            {
                this.prefixMap = Collections.emptyMap();
            }
        }
        public String getPrefix(String namespaceURI) {
            // TODO Auto-generated method stub
            return null;
        }
        public Iterator getPrefixes(String namespaceURI) {
            // TODO Auto-generated method stub
            return null;
        }
        public String getNamespaceURI(String prefix) {
                if (prefix == null) throw new NullPointerException("Invalid Namespace Prefix");
                else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX))
                    return "http://www.PureEdge.com/XFDL/6.5";
                else if ("custom".equals(prefix))
                    return "http://www.PureEdge.com/XFDL/Custom";
                else if ("designer".equals(prefix)) 
                    return "http://www.PureEdge.com/Designer/6.1";
                else if ("pecs".equals(prefix)) 
                    return "http://www.PureEdge.com/PECustomerService";
                else if ("xfdl".equals(prefix))
                    return "http://www.PureEdge.com/XFDL/6.5";      
                else if ("xforms".equals(prefix)) 
                    return "http://www.w3.org/2003/xforms";
                else    
                    return XMLConstants.NULL_NS_URI;
        }


    }

    // Corrected expression: starts at the root element and selects the attribute with @number.
    protected static final String QUERY_FORM_NUMBER = 
        "/XFDL/globalpage/global/xmlmodel/instances/instance" + 
        "/form_metadata/title/documentnbr/@number";

    public static void main(String[] args) {
        try
        {
            DocumentBuilderFactory dbfac = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = dbfac.newDocumentBuilder();
            Document doc = docBuilder.parse(new File(args[0]));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/@sid"));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/@id" ));
            System.out.println(extractNodeValue(doc, "/XFDL/globalpage/global/xmlmodel/instances/instance/form_metadata/title/documentnbr/@number" ));
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }

    private static String extractNodeValue(Document doc, String expression) {
        try{

            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(new MyNamespaceContext(null));

            Node result = (Node)xPath.evaluate(expression, doc, XPathConstants.NODE);
            if(result != null) {
                return result.getNodeValue();
            } else {
                throw new RuntimeException("can't find expression");
            }

        } catch (XPathExpressionException err) {
            throw new RuntimeException(err);
        }
    }
}
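
The MyNamespaceContext constructor above accepts a prefixMap that getNamespaceURI() never reads. One way to put such a map to use is sketched below, assuming a simple prefix-to-URI table is enough and leaving out the reserved xml and xmlns prefixes that a complete NamespaceContext would also handle. MapNamespaceContext and xfdlContext are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class MapNamespaceContext implements NamespaceContext {
    private final Map<String, String> prefixToUri;

    public MapNamespaceContext(Map<String, String> prefixToUri) {
        // Defensive copy so later changes to the caller's map have no effect.
        this.prefixToUri =
            Collections.unmodifiableMap(new HashMap<String, String>(prefixToUri));
    }

    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("prefix must not be null");
        }
        String uri = prefixToUri.get(prefix);
        return uri != null ? uri : XMLConstants.NULL_NS_URI;
    }

    public String getPrefix(String namespaceURI) {
        // Returns any one bound prefix, or null when the URI is unbound.
        Iterator<String> it = getPrefixes(namespaceURI);
        return it.hasNext() ? it.next() : null;
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        if (namespaceURI == null) {
            throw new IllegalArgumentException("namespaceURI must not be null");
        }
        List<String> matches = new ArrayList<String>();
        for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
            if (entry.getValue().equals(namespaceURI)) {
                matches.add(entry.getKey());
            }
        }
        return Collections.unmodifiableList(matches).iterator();
    }

    // Convenience factory holding the bindings listed in the question.
    public static MapNamespaceContext xfdlContext() {
        Map<String, String> map = new HashMap<String, String>();
        map.put("xfdl", "http://www.PureEdge.com/XFDL/6.5");
        map.put("custom", "http://www.PureEdge.com/XFDL/Custom");
        map.put("designer", "http://www.PureEdge.com/Designer/6.1");
        map.put("pecs", "http://www.PureEdge.com/PECustomerService");
        map.put("xforms", "http://www.w3.org/2003/xforms");
        return new MapNamespaceContext(map);
    }
}
```

With this in place, xPath.setNamespaceContext(MapNamespaceContext.xfdlContext()) would replace the hard-coded if/else chain, and adding a prefix becomes a one-line change.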
