Suggestion to parse this XML in Java


Question

I'm not new to Java, but I am relatively new to XML parsing. I know a little about many of the XML tools out there, but not much about any one of them. I am also no XML pro.

My particular problem is this: I have been given an XML document that I cannot modify, and I only need to parse scattered bits of it into Java objects. Sheer speed is not much of a factor so long as it's reasonable. Likewise, the memory footprint need not be optimal, just not insane. I only need to read through the document once to parse it; after that I'll throw it in the bit bucket and use only my POJOs.

So, I'm open to suggestions: which tool would you use?
And would you kindly suggest a bit of starter code to address my particular need?

Here's a snippet of sample XML and the associated POJO I'm trying to craft:

<xml>
  <item id="...">
    ...
  </item>
  <metadata>
    <resources>

      <resource>
        <ittype>Service_Links</ittype>
        <links>
          <link>
            <path>http://www.stackoverflow.com</path>
            <description>Stack Overflow</description>
          </link>
          <link>
            <path>http://www.google.com</path>
            <description>Google</description>
          </link>
        </links>
      </resource>

      <resource>
        <ittype>Article_Links</ittype>
        <links>
          ...
        </links>
      </resource>

      ...

    </resources>
  </metadata>
</xml>


public class MyPojo {

    @Attribute(name="id")
    @Path("item")
    public String id;

    @ElementList(entry="link")
    @Path("metadata/resources/resource/links")
    public List<Link> links;
}

NOTE: this question grew out of an earlier question of mine, in which I was trying to solve the problem with SimpleXml; I've reached the point where I thought someone might be able to suggest a different route to the same goal.

Also note: I'm really hoping for a CLEAN solution, by which I mean one that uses annotations and/or XPath with the least amount of code. The last thing I want is a huge class file with huge, unwieldy methods. THAT I already have; I'm trying to find a better way.

:D

Recommended answer

OK, so I settled on a solution that (to me) seemed to address my needs in the most reasonable way. My apologies to those who made the other suggestions, but I liked this route better because it keeps most of the parsing rules as annotations, and the procedural code I had to write was minimal.

I ended up going with JAXB. Initially I thought JAXB could create XML from a Java class or parse XML into a Java class only if an XSD was available. Then I discovered that JAXB has annotations that let it parse XML into a Java class without an XSD.

The XML file I'm working with is huge and very deep, but I only need bits and pieces of it here and there; I was worried that keeping track of what maps to where would become very difficult in the future. So I chose to structure a tree of folders (packages) modeled after the XML: each folder maps to an element, and each folder contains a POJO representing that element.

The problem is that sometimes an element has a descendant several levels down that carries a single property I care about. It would be a pain to create four nested folders, each with its own POJO, just to get at that one property. But that's how you do it with standard JAXB (at least, from what I can tell); once again I was in a corner.

Then I stumbled on EclipseLink's JAXB implementation: MOXy. MOXy has an @XmlPath annotation that I could place in the parent POJO and use to navigate several levels down to a single property, without creating all those folders and element POJOs. Nice.
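One practical detail for anyone following this route: MOXy is not used automatically just because it is on the classpath. As far as I know, the usual way to select it is a file named jaxb.properties placed in the same package as the annotated classes. The key shown below is the one for the javax.xml.bind generation of the API, which is an assumption about the setup used here; Jakarta-based versions use a different key.

```properties
# jaxb.properties, placed next to the annotated classes (e.g. beside Xml.class)
# Tells the JAXB bootstrap code to build its context with EclipseLink MOXy.
javax.xml.bind.context.factory=org.eclipse.persistence.jaxb.JAXBContextFactory
```

With that in place, the standard bootstrap (JAXBContext.newInstance(Xml.class), then createUnmarshaller() and unmarshal(...)) should hand back objects populated through the MOXy-specific annotations as well as the standard ones.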

So I created something like this: (note: I chose to use getters for cases where I need to massage the value)

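// Imports assumed for both classes below (javax-era JAXB plus EclipseLink MOXy on the classpath):
import java.util.Date;
import java.util.List;
import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
import org.eclipse.persistence.oxm.annotations.XmlPath;

// maps to the root-"xml" element in the file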
@XmlRootElement( name="xml" )
@XmlAccessorType( XmlAccessType.FIELD )
public class Xml {

    // this is standard JAXB
    @XmlElement
    private Item item;
    public Item getItem() {    
        return this.item;
    }

    ...
}

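// maps to the "<xml><item>"-element in the file
// field access, as on the root class, so JAXB binds the annotated fields rather than the getters
@XmlAccessorType( XmlAccessType.FIELD )
public class Item {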

    // standard JAXB; maps to "<xml><item id="...">"
    @XmlAttribute              
    private String id;
    public String getId() {
        return this.id;
    }

    // getting an attribute buried deep down
    // MOXY; maps to "<xml><item><rating average="...">"
    @XmlPath( "rating/@average" )    
    private Double averageRating;
    public Double getAverageRating() {
        return this.averageRating;
    }

    // getting a list buried deep down
    // MOXY; maps to "<xml><item><service><identification><aliases><alias.../><alias.../>"
    @XmlPath( "service/identification/aliases/alias/text()" )
    private List<String> aliases;
    public List<String> getAliases() {
        return this.aliases;
    }

    // using a getter to massage the value
    @XmlElement(name="dateforindex")
    private String dateForIndex;
    public Date getDateForIndex() {
        // logic to parse the string-value into a Date
    }

}
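To make it easier to see what the path expressions in those @XmlPath annotations select, here is a minimal sketch that evaluates the equivalent XPath with the JDK's own javax.xml.xpath API. This is not MOXy and not how MOXy works internally; it only illustrates the navigation. The sample document, the class name XPathSketch, and its helper methods are all hypothetical, shaped after the element names mentioned in the comments above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {

    // Hypothetical sample; the real document is much larger and deeper.
    static final String SAMPLE =
          "<xml><item id=\"42\">"
        + "<rating average=\"4.5\"/>"
        + "<service><identification><aliases>"
        + "<alias>first</alias><alias>second</alias>"
        + "</aliases></identification></service>"
        + "</item></xml>";

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Evaluates an expression and returns the string value of the first match.
    static String selectString(Document doc, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + expr, e);
        }
    }

    // Evaluates an expression and returns the value of every matching node.
    static List<String> selectStrings(Document doc, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        // Same navigation as @XmlAttribute id, relative to <item>
        System.out.println(selectString(doc, "/xml/item/@id"));
        // Same navigation as @XmlPath("rating/@average")
        System.out.println(Double.parseDouble(selectString(doc, "/xml/item/rating/@average")));
        // Same navigation as @XmlPath("service/identification/aliases/alias/text()")
        System.out.println(selectStrings(doc, "/xml/item/service/identification/aliases/alias/text()"));
    }
}
```

The annotation-based version keeps these paths next to the fields they populate, which is what makes it the cleaner option here; hand-written XPath like this is mainly useful for checking that a path selects what you expect.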

Also note that I took the route of separating the XML-object from the model-object I actually use in the app. Thus, I have a factory that transforms these crude objects into much more robust objects which I actually use in my app.
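A rough sketch of what such a factory might look like follows. Everything in it is hypothetical: RawItem stands in for the JAXB-bound Item class, CatalogItem and CatalogItemFactory are invented names, and the date is assumed to arrive in ISO format (yyyy-MM-dd), which the original does not state. It also uses java.time.LocalDate rather than java.util.Date for brevity.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in for the XML-bound object: nullable fields, raw strings.
class RawItem {
    String id;
    Double averageRating;
    List<String> aliases;
    String dateForIndex;
}

// The model object the application actually uses: immutable, no nulls in collections.
final class CatalogItem {
    private final String id;
    private final double averageRating;
    private final List<String> aliases;
    private final LocalDate indexDate;

    CatalogItem(String id, double averageRating, List<String> aliases, LocalDate indexDate) {
        this.id = id;
        this.averageRating = averageRating;
        this.aliases = aliases;
        this.indexDate = indexDate;
    }

    String getId() { return id; }
    double getAverageRating() { return averageRating; }
    List<String> getAliases() { return aliases; }
    LocalDate getIndexDate() { return indexDate; }
}

class CatalogItemFactory {

    // Validates and normalizes a raw object into a model object.
    static CatalogItem fromRaw(RawItem raw) {
        if (raw == null || raw.id == null || raw.id.isEmpty()) {
            throw new IllegalArgumentException("item without an id");
        }
        // Missing rating becomes 0.0 (an arbitrary default chosen for this sketch).
        double rating = raw.averageRating == null ? 0.0 : raw.averageRating;
        // Missing list becomes an empty list; present list is copied defensively.
        List<String> aliases = raw.aliases == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableList(new ArrayList<>(raw.aliases));
        // Assumes ISO dates; a missing date stays null.
        LocalDate indexDate = raw.dateForIndex == null
                ? null
                : LocalDate.parse(raw.dateForIndex.trim());
        return new CatalogItem(raw.id, rating, aliases, indexDate);
    }
}
```

Keeping the conversion in one place means the XML-bound classes can stay as thin as the annotations allow, while defaults, validation, and type conversion live in ordinary Java that is easy to test without any XML at all.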
