Spring Batch - How to read one big file in multiple threads?


Problem description



Problem: Read a file larger than 10 MB and load it into a staging table using Spring Batch. How can we maintain state while reading the file, so that the job can be restarted if it fails?

As per the documentation, FlatFileItemReader is not thread-safe, and if we try to make it thread-safe we end up losing restartability. So the basic questions are:

1. Is there a way to read the file in blocks, with each thread knowing which block it needs to read?
2. If we make the read synchronized, what changes are required to keep the job restartable in this scenario?

If anyone has faced a similar issue, or has any analysis of how these approaches perform, it would help us decide.

Also, any pointers or sample code would be appreciated.

Solution

Multithreading only helps when the threads can make progress on different things at the same time. For example, two threads can run on separate CPUs, or one thread can wait for a network message while another paints the screen.

In your case, though, both threads would be waiting on the same I/O from the same device, so there is no point in using more than one thread to read the file.
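If the per-record processing (rather than the read itself) turns out to be the slow part, one arrangement consistent with the point above is to keep a single thread reading the file sequentially and hand chunks of lines to a pool of workers. The following is only a minimal sketch in plain Java, not Spring Batch API: the class name SingleReaderParallelWorkers, the process method, and its parameters are all hypothetical, and it queues every chunk in memory, which is acceptable for a file of roughly 10 MB but would need a bounded queue for much larger inputs.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: one thread reads, several threads process chunks. */
public class SingleReaderParallelWorkers {

    /**
     * Reads the file sequentially on the calling thread, groups lines into
     * chunks of chunkSize, and submits each chunk to a fixed pool of workers.
     * Results are returned in the order the chunks were read.
     */
    public static <R> List<R> process(Path file, int chunkSize, int workers,
                                      Function<List<String>, R> chunkProcessor)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<R>> futures = new ArrayList<>();
            try (BufferedReader reader = Files.newBufferedReader(file)) {
                List<String> chunk = new ArrayList<>(chunkSize);
                String line;
                while ((line = reader.readLine()) != null) {
                    chunk.add(line);
                    if (chunk.size() == chunkSize) {
                        futures.add(pool.submit(taskFor(chunk, chunkProcessor)));
                        chunk = new ArrayList<>(chunkSize); // never reuse a submitted list
                    }
                }
                if (!chunk.isEmpty()) {
                    futures.add(pool.submit(taskFor(chunk, chunkProcessor)));
                }
            }
            // Wait for every chunk; a failure in any worker surfaces here.
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    private static <R> Callable<R> taskFor(List<String> chunk,
                                           Function<List<String>, R> chunkProcessor) {
        return () -> chunkProcessor.apply(chunk);
    }
}
```

Only the reading stays single-threaded here; whether the workers give any speed-up depends on how expensive the processing step is compared with the I/O.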

See also the related question "Reading a file by multiple threads".
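On the restartability part of the question: with a single sequential reader, the state that needs to survive a failure can be as small as the number of lines already committed. The sketch below illustrates that idea in plain Java under stated assumptions; it is not how Spring Batch implements restart, and the class name RestartableChunkReader, the checkpoint file, and the writer callback are hypothetical. In a real job the staging-table write and the checkpoint update would need to share a transaction, which this sketch does not attempt.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: one thread reads in chunks and checkpoints its position. */
public class RestartableChunkReader {

    /**
     * Reads dataFile in chunks of chunkSize lines, passes each chunk to writer,
     * and then stores the number of committed lines in checkpointFile.
     * Returns the total number of lines committed, including earlier runs.
     */
    public static long run(Path dataFile, Path checkpointFile, int chunkSize,
                           Consumer<List<String>> writer) throws IOException {
        long committed = readCheckpoint(checkpointFile);
        try (BufferedReader reader =
                     Files.newBufferedReader(dataFile, StandardCharsets.UTF_8)) {
            // Skip the lines that a previous run already committed.
            for (long i = 0; i < committed; i++) {
                if (reader.readLine() == null) {
                    return committed;
                }
            }
            List<String> chunk = new ArrayList<>(chunkSize);
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    committed = commit(chunk, committed, checkpointFile, writer);
                }
            }
            if (!chunk.isEmpty()) {
                committed = commit(chunk, committed, checkpointFile, writer);
            }
        }
        return committed;
    }

    private static long commit(List<String> chunk, long committed, Path checkpointFile,
                               Consumer<List<String>> writer) throws IOException {
        // If the writer throws, the checkpoint below is never advanced,
        // so the next run starts again at the beginning of this chunk.
        writer.accept(new ArrayList<>(chunk));
        long updated = committed + chunk.size();
        Files.writeString(checkpointFile, Long.toString(updated));
        chunk.clear();
        return updated;
    }

    private static long readCheckpoint(Path checkpointFile) throws IOException {
        if (!Files.exists(checkpointFile)) {
            return 0L;
        }
        String text = Files.readString(checkpointFile).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }
}
```

Calling run again with the same checkpoint file after a failure resumes at the first chunk that was not committed, so a failed chunk is retried as a whole rather than being skipped.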
