Spring Batch: How to process files themselves as items?

Question

I am new to Spring Batch development and have the following requirement. There will be an S3 source containing zip files, and each zip file will contain multiple PDF files and XML files (e.g. 100 PDFs and 100 XML files). Each XML file contains data about a PDF. The batch job needs to read each PDF file together with its associated XML file and push them to a REST service or database.

Most of the examples I looked at cover how to read a line from a file and process it. Here, the items themselves are files. I want to read one PDF file (as bytes) plus its XML file (converted into a POJO) as a set, and push these sets to the REST service one by one.

Right now, I am doing all the reading and processing inside a single tasklet, but I am sure there is a better way to implement it. Please suggest one. Thank you.

Recommended answer

The chunk-oriented processing model requires you to first define what an item is. In your case, one option is to consider an item to be a PDF file (data) together with its associated XML file (metadata). You can create a class that represents such an item and a custom item reader for it. Once that is in place, you can use the reader in a chunk-oriented step, along with a processor or writer that sends the data to your REST endpoint.
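
As a rough illustration of the item class and custom reader described above, here is a minimal sketch. It makes several assumptions that are not in the question: `DocumentItem` and `ZipDocumentItemReader` are hypothetical names, a PDF and its XML are paired by sharing the same entry name apart from the extension, and one whole zip is small enough to hold in memory. So that it compiles with only the JDK, it declares a local `ItemReader` stand-in that mirrors the contract of Spring Batch's `org.springframework.batch.item.ItemReader` (`read()` returns `null` when the input is exhausted); in a real job you would implement the Spring interface instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Queue;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Local stand-in for org.springframework.batch.item.ItemReader, declared here
// only so the sketch compiles without Spring Batch on the classpath.
interface ItemReader<T> {
    T read() throws Exception; // returns null when there is nothing left to read
}

// Hypothetical item type: one PDF (data) plus its XML file (metadata).
final class DocumentItem {
    final String name; // entry name without extension, e.g. "invoice-1"
    final byte[] pdf;  // raw PDF bytes
    final String xml;  // raw XML text, assumed UTF-8; map it to a POJO later

    DocumentItem(String name, byte[] pdf, String xml) {
        this.name = name;
        this.pdf = pdf;
        this.xml = xml;
    }
}

// Hypothetical reader: pairs "foo.pdf" with "foo.xml" from a single zip archive.
final class ZipDocumentItemReader implements ItemReader<DocumentItem> {
    private final Queue<DocumentItem> items = new ArrayDeque<>();

    ZipDocumentItemReader(InputStream zipStream) throws IOException {
        Map<String, byte[]> pdfs = new LinkedHashMap<>();
        Map<String, byte[]> xmls = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String entryName = entry.getName();
                int dot = entryName.lastIndexOf('.');
                if (entry.isDirectory() || dot < 0) {
                    continue; // skip folders and files without an extension
                }
                String base = entryName.substring(0, dot);
                String lower = entryName.toLowerCase(Locale.ROOT);
                if (lower.endsWith(".pdf")) {
                    pdfs.put(base, readAll(zip));
                } else if (lower.endsWith(".xml")) {
                    xmls.put(base, readAll(zip));
                }
            }
        }
        // Build one item per PDF, in the order the PDFs appeared in the archive.
        for (Map.Entry<String, byte[]> pdf : pdfs.entrySet()) {
            byte[] xml = xmls.get(pdf.getKey());
            if (xml == null) {
                throw new IllegalStateException("No XML found for " + pdf.getKey());
            }
            items.add(new DocumentItem(pdf.getKey(), pdf.getValue(),
                    new String(xml, StandardCharsets.UTF_8)));
        }
    }

    @Override
    public DocumentItem read() {
        return items.poll(); // null signals the end of the input
    }

    // Reads the current zip entry fully; the stream reports -1 at entry end.
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }
}
```

With a reader like this, each call to `read()` hands the step one PDF/XML pair. Converting the XML text into a POJO could then happen in an item processor, and the call to the REST endpoint in an item writer, which keeps reading, transforming, and sending separate instead of bundling them in one tasklet.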
