Spring Batch Stax XML reading job is not ending when out of input

Problem description

I'm using Spring Batch to set up a job that will process a potentially very large XML file. I think I've set it up appropriately, but at runtime I'm finding that the job runs, processes its input, and then just hangs in an executing state (I can confirm by viewing the JobExecution's status in the JobRepository).

I've read through the Batch documentation several times but I don't see any obvious "make the job stop when out of input" configuration that I'm missing.

Here's the relevant portion of my application context:

<batch:job id="processPartnerUploads" restartable="true">
    <batch:step id="processStuffHoldings">
        <batch:tasklet>
            <batch:chunk reader="stuffReader" writer="stuffWriter" commit-interval="1"/>
        </batch:tasklet>        
    </batch:step>
</batch:job>

<bean id="stuffReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
    <property name="fragmentRootElementName" value="stuff" />
    <property name="resource" value="file:///path/to/file.xml" />
    <property name="unmarshaller" ref="stuffUnmarshaller" />
</bean>

<bean id="stuffUnmarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="contextPath" value="com.company.project.xmlcontext"/>
</bean>

<bean id="stuffWriter" class="com.company.project.batch.StuffWriter" />

In case it matters, the "StuffWriter" is just a class that logs the items that would be written.

Please let me know if I've missed some important nuance involved with Batch and/or Stax.

Solution

I've resolved this problem for myself, though I'm surprised by what I had to do. Debugging through StaxEventItemReader, I noticed that the inner loop in the moveCursorToNextFragment() method would go infinite when the end of my document was reached. Here's the relevant code:

while (true) {
    while (reader.peek() != null && !reader.peek().isStartElement()) {
        reader.nextEvent();
    }
    if (reader.peek() == null) {
        return false;
    }
    QName startElementName = ((StartElement) reader.peek()).getName();
    if (startElementName.getLocalPart().equals(fragmentRootElementName)) {
        if (fragmentRootElementNameSpace == null
                || startElementName.getNamespaceURI().equals(fragmentRootElementNameSpace)) {
            return true;
        }
    }
    reader.nextEvent();
}

reader.peek() was never returning null. It seemed to me like this code should be checking to see if the XMLEvent encountered during peek() is at the end of the document, but this wasn't so simple due to the StaxEventItemReader's reliance on a DefaultFragmentEventReader wrapping the standard XMLEventReader.
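To see what an event stream looks like at the end of input, independent of Spring Batch, it can help to dump the event types from a plain JDK XMLEventReader. The following is only a sketch: the class name EventTypeDump and the sample XML are made up for illustration, and it says nothing about how DefaultFragmentEventReader behaves.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper, not part of Spring Batch.
final class EventTypeDump {

    // Returns the integer event type of every event the reader produces, in order.
    static List<Integer> eventTypes(String xml) {
        List<Integer> types = new ArrayList<>();
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                types.add(reader.nextEvent().getEventType());
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return types;
    }

    public static void main(String[] args) {
        List<Integer> types = eventTypes("<stuffs><stuff/></stuffs>");
        int last = types.get(types.size() - 1);
        System.out.println("last event is END_DOCUMENT: "
                + (last == XMLStreamConstants.END_DOCUMENT));
    }
}
```

The last event delivered is an END_DOCUMENT event, so a loop that only tests for a null peek() depends on the reader (or whatever wraps it) eventually reporting that nothing is left.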

What I wound up doing was rolling my own ItemReader based on StaxEventItemReader but without using a FragmentEventReader at all, and then adjusting the inner loop code to read like so:

        if (reader.peek().getEventType() == XMLStreamConstants.END_DOCUMENT) {
            return false;
        }
        reader.nextEvent();

That works perfectly and allows my batch job to go to COMPLETED at the end of input.
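For anyone who wants to try the idea outside of a batch job, here is a minimal, self-contained sketch of a cursor loop with the END_DOCUMENT check, using only the JDK's StAX classes. It is not the actual reader I ended up with: the class name FragmentCursorSketch, the countFragments helper and the sample XML are all hypothetical, and namespace matching is left out for brevity.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical sketch, not the Spring Batch implementation.
final class FragmentCursorSketch {

    // Advances until the next event is a start element with the given local name.
    // Returns false once the input is exhausted.
    static boolean moveCursorToNextFragment(XMLEventReader reader, String fragmentRootElementName)
            throws XMLStreamException {
        while (true) {
            XMLEvent next = reader.peek();
            // Treat both "no more events" and an explicit END_DOCUMENT as end of input.
            if (next == null || next.getEventType() == XMLStreamConstants.END_DOCUMENT) {
                return false;
            }
            if (next.isStartElement()) {
                QName name = next.asStartElement().getName();
                if (name.getLocalPart().equals(fragmentRootElementName)) {
                    return true;
                }
            }
            reader.nextEvent();
        }
    }

    // Counts matching fragments; terminates because the cursor loop stops at END_DOCUMENT.
    static int countFragments(String xml, String fragmentRootElementName) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            int count = 0;
            while (moveCursorToNextFragment(reader, fragmentRootElementName)) {
                count++;
                reader.nextEvent(); // consume the matched start element
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<stuffs><stuff id=\"1\"/><other/><stuff id=\"2\"/></stuffs>";
        System.out.println(countFragments(xml, "stuff"));
    }
}
```

With the sample document in main this prints 2. Checking for END_DOCUMENT in addition to null means the loop ends whether or not the reader ever returns null from peek().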

I'm really surprised that I had to do this, though. I wondered if the underlying implementation of the streaming XML libraries I was using was at fault, but I'm using stax2-api-3.0.1.jar as referenced in the Spring Batch dependency list.
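One quick way to check which StAX implementation is actually in play is to print the class that XMLInputFactory.newInstance() returns, since that lookup happens at runtime. This is a small sketch for diagnosis only; the class name StaxImplementationCheck is made up, and the output depends entirely on the JDK and the jars on the classpath.

```java
import javax.xml.stream.XMLInputFactory;

// Hypothetical diagnostic helper.
final class StaxImplementationCheck {

    // Name of the XMLInputFactory implementation selected at runtime.
    static String inputFactoryClassName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(inputFactoryClassName());
    }
}
```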

I also found that I'm not alone.
