Spring Batch: Multi-File Resource Takes the Same Time as a Single Thread?

Question

I am using Spring Batch to migrate data from XML to an Oracle database.

With single-threaded execution, the process takes approximately 80-90 minutes to insert 20K users.

I want to cut that time by more than half, but even with a multi-file resource I have not been able to achieve it.

I have a single XML file to process, so I started simply by adding

  • a task executor and making the reader synchronized, but I saw no gain.

So I split the XML into multiple XML files and want to try a multi-file resource. Here is the configuration.

<batch:job id="importJob">

        <batch:step id="step1Master">
            <batch:partition handler="handler" partitioner="partitioner" />
        </batch:step>

</batch:job>

<bean id="handler"
        class="org.springframework.batch.core.partition.support.TaskExecutorPartitionHandler">
        <property name="taskExecutor" ref="taskExecutor" />
        <property name="step" ref="slaveStep" />
        <property name="gridSize" value="20" />
    </bean>

    <batch:step id="slaveStep">
        <batch:tasklet transaction-manager="transactionManager"
            allow-start-if-complete="true">
            <batch:chunk reader="reader" writer="writer"
                processor="processor" commit-interval="1000" skip-limit="1500000">
                <batch:skippable-exception-classes>
                    <batch:include class="java.lang.Exception" />
                </batch:skippable-exception-classes>
            </batch:chunk>


        </batch:tasklet>

    </batch:step>

    <bean id="taskExecutor"
        class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
        <property name="corePoolSize" value="100" />
        <property name="maxPoolSize" value="300" />
        <property name="allowCoreThreadTimeOut" value="true" />
    </bean>


    <bean id="partitioner"
        class="org.springframework.batch.core.partition.support.MultiResourcePartitioner"
        scope="step">
        <property name="keyName" value="inputFile" />
        <property name="resources"
            value="file:/.../*.xml" />
    </bean>



    <bean id="processor"
        class="...Processor"
        scope="step" />

    <bean id="reader" class="org.springframework.batch.item.xml.StaxEventItemReader"
        scope="step">
        <property name="fragmentRootElementName" value="user" />
        <property name="unmarshaller" ref="userDetailUnmarshaller" />
        <property name="resource" value="#{stepExecutionContext[inputFile]}" />
    </bean>

Each XML file contains around 1000 users, and I am testing with 20 files.

I set commit-interval=1000 because each file has 1000 records to insert into the DB.

Does the commit-interval need to be adjusted accordingly?
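To make the arithmetic behind this question concrete, here is a minimal sketch that only counts how many chunks contain items for one file at a given commit interval. It does not use Spring Batch; the class and method names (`ChunkMath`, `itemChunks`) are hypothetical and exist only for illustration.

```java
public class ChunkMath {

    /** Chunks that actually contain items for one partition (ceiling division). */
    static long itemChunks(long items, long commitInterval) {
        if (items < 0 || commitInterval <= 0) {
            throw new IllegalArgumentException("items must be >= 0 and commitInterval > 0");
        }
        return (items + commitInterval - 1) / commitInterval;
    }

    public static void main(String[] args) {
        long itemsPerFile = 1000; // from the question: about 1000 users per file
        for (long interval : new long[] {1000, 100, 10}) {
            System.out.printf("commit-interval=%d -> %d item chunks per file%n",
                    interval, itemChunks(itemsPerFile, interval));
        }
    }
}
```

With the interval equal to the file size, all inserts for a file share one chunk transaction, so a failure on any record rolls back that whole chunk. A smaller interval bounds that cost, although by itself it would not be expected to halve the run time.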

I am using Oracle DB. Do I need to do any pool management there? The Oracle DB pool currently configured in JBoss has min pool = 100 and max pool = 300.
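As a rough way to reason about pool sizing: `MultiResourcePartitioner` creates one partition per matched resource, so 20 files give 20 worker step executions regardless of the executor's limits. The sketch below uses a plain `java.util.concurrent.ThreadPoolExecutor` rather than Spring's `ThreadPoolTaskExecutor`, and it assumes the executor keeps an unbounded queue (no `queueCapacity` is set in the configuration above). The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolSizingSketch {

    /** Submits one blocking task per partition and reports how many threads the pool started. */
    static int threadsStarted(int partitions, int corePoolSize, int maxPoolSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>()); // unbounded queue (assumption)
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < partitions; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // hold the thread, standing in for a busy worker step
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize();
        } finally {
            release.countDown(); // let the stand-in workers finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Values taken from the question: 20 files, corePoolSize=100, maxPoolSize=300.
        System.out.println("threads started: " + threadsStarted(20, 100, 300));
        // A smaller core size caps concurrency; the remaining tasks wait in the queue.
        System.out.println("threads started: " + threadsStarted(20, 5, 300));
    }
}
```

Under these assumptions, no more than 20 workers run at once with 20 files, so no more than about 20 connections should be in use by the writers at a time. A minimum pool of 100 would already cover that, which suggests the connection pool is unlikely to be the limiting factor.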

I see log entries like:

17:01:50,553 DEBUG [Writer] (taskExecutor-11) [UserDetailWriter] | user added
17:01:50,683 DEBUG [Writer] (taskExecutor-15) [UserDetailWriter] | user added
17:01:51,093 DEBUG [Writer] (taskExecutor-11) [UserDetailWriter] | user added
17:01:59,795 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added
17:02:00,385 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added
17:02:00,385 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added

It seems multiple threads are being created, but I still do not see any performance improvement.
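One way to check this more rigorously is to measure the gaps between consecutive writer log entries and count the distinct threads. The sketch below parses the six log lines quoted above; the class name `WriterLogGaps` and its helpers are hypothetical, and the regular expression assumes the exact log layout shown in the excerpt.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WriterLogGaps {

    // Captures the timestamp and the thread name, e.g. "17:01:50,553" and "taskExecutor-11".
    private static final Pattern LINE =
            Pattern.compile("^(\\d{2}:\\d{2}:\\d{2},\\d{3}) .*?\\((taskExecutor-\\d+)\\)");
    private static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    static final String[] SAMPLE = {
        "17:01:50,553 DEBUG [Writer] (taskExecutor-11) [UserDetailWriter] | user added",
        "17:01:50,683 DEBUG [Writer] (taskExecutor-15) [UserDetailWriter] | user added",
        "17:01:51,093 DEBUG [Writer] (taskExecutor-11) [UserDetailWriter] | user added",
        "17:01:59,795 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added",
        "17:02:00,385 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added",
        "17:02:00,385 DEBUG [Writer] (taskExecutor-12) [UserDetailWriter] | user added",
    };

    /** Milliseconds between each pair of consecutive matching log lines. */
    static List<Long> gapsMillis(String[] lines) {
        List<Long> gaps = new ArrayList<>();
        LocalTime previous = null;
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find()) {
                continue; // ignore lines that do not match the expected layout
            }
            LocalTime current = LocalTime.parse(m.group(1), TIME);
            if (previous != null) {
                gaps.add(Duration.between(previous, current).toMillis());
            }
            previous = current;
        }
        return gaps;
    }

    /** Distinct thread names that appear in the matching log lines. */
    static Set<String> threads(String[] lines) {
        Set<String> names = new TreeSet<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                names.add(m.group(2));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("gaps (ms): " + gapsMillis(SAMPLE));
        System.out.println("threads:   " + threads(SAMPLE));
    }
}
```

For this excerpt, three threads appear, yet the six writes span almost ten seconds and one gap alone is 8702 ms. That pattern hints that the workers spend most of their time waiting on something per record rather than writing in parallel, though six lines are too few to draw a firm conclusion.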

Please suggest what I am doing wrong.

Recommended answer

Go through this documentation on parallel processing:

http://docs.spring.io/spring-batch/trunk/reference/html/scalability.html#scalabilityParallelSteps
