JSR 352: How to collect data from the Writer of each partition of a partitioned step?

Question

So, I have 2 partitions in a step that writes to a database. I want to record the number of rows written in each partition, get the sum, and print it to the log.

I was thinking of using a static variable in the Writer and using the StepContext/JobContext to get it in afterStep() of the StepListener. However, when I tried it, I got null. I am able to get these values in close() of the Reader.

Is this the right way to go about it? Or should I use a PartitionCollector/PartitionReducer/PartitionAnalyzer?

I am using Java Batch in WebSphere Liberty, and I am developing in Eclipse.

Recommended answer

I was thinking of using a static variable in the Writer and use Step Context/Job Context to get it in afterStep() of the Step Listener. However when I tried it I got null.

The ItemWriter might already be destroyed at this point, but I'm not sure.

Is this the right way to go about it?

Yes, it should be good enough. However, you need to ensure the total row count is shared across all partitions, because the batch runtime maintains a StepContext clone per partition. You should use JobContext instead.
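For the static-variable idea from the question, the following is a minimal sketch of a counter that several partition threads can add to safely. It assumes all partitions run as threads in the same JVM; `SharedRowCounter` and its methods are hypothetical names for illustration, not part of the JSR 352 API, and the threads in `simulate` only stand in for partitions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of JSR 352: a JVM-wide counter that every
// partition thread can add to. Only meaningful if all partitions share one JVM.
final class SharedRowCounter {
    private static final AtomicLong ROWS = new AtomicLong();

    private SharedRowCounter() { }

    // Would be called from the writer after each successful chunk write.
    static void add(int rowsWritten) {
        ROWS.addAndGet(rowsWritten);
    }

    // Would be called once the step has finished; resets the counter for the next run.
    static long getAndReset() {
        return ROWS.getAndSet(0);
    }

    // Simulates partitions writing chunks concurrently and returns the total.
    static long simulate(int partitions, int chunksPerPartition, int chunkSize) {
        Thread[] workers = new Thread[partitions];
        for (int p = 0; p < partitions; p++) {
            workers[p] = new Thread(() -> {
                for (int c = 0; c < chunksPerPartition; c++) {
                    add(chunkSize);
                }
            });
            workers[p].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait for every simulated partition to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        }
        return getAndReset();
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, 5, 10) + " rows written (all partitions)");
    }
}
```

Because the counter is static, concurrent executions of the same job in one server would share it, which is one reason the collector/analyzer route may be the safer choice.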

I think using a PartitionCollector and a PartitionAnalyzer is a good choice, too. The PartitionCollector interface has a method, collectPartitionData(), that collects data coming from its partition. Once collected, the batch runtime passes this data to the PartitionAnalyzer for analysis. Notice that there are:

  • N PartitionCollectors per step (1 per partition)
  • N StepContexts per step (1 per partition)
  • 1 PartitionAnalyzer per step

The number of records written can be passed via the StepContext's transientUserData. Since each StepContext is reserved for its own partition, the transient user data won't be overwritten by other partitions.

Here is the implementation:

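MyItemWriter:

@Inject
private StepContext stepContext;

@Override
public void writeItems(List<Object> items) throws Exception {
    // ... write the items to the database ...

    // Add this chunk to the rows the collector has not picked up yet.
    Object userData = stepContext.getTransientUserData();
    int partRowCount = (userData != null ? (int) userData : 0) + items.size();
    stepContext.setTransientUserData(partRowCount);
}

MyPartitionCollector:

@Inject
private StepContext stepContext;

@Override
public Serializable collectPartitionData() throws Exception {
    // For a chunk step this runs after every checkpoint and once more at the
    // end of the partition, so report only the rows written since the last
    // call and then reset the counter to avoid double counting.
    Object userData = stepContext.getTransientUserData();
    int partRowCount = userData != null ? (int) userData : 0;
    stepContext.setTransientUserData(0);
    return partRowCount;
}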

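MyPartitionAnalyzer:

private int rowCount = 0;

@Override
public void analyzeCollectorData(Serializable fromCollector) throws Exception {
    // Called each time a partition's collector sends its payload;
    // rowCount is the running total across all partitions.
    rowCount += (int) fromCollector;
    System.out.printf("%d rows processed (all partitions).%n", rowCount);
}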
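The collector/analyzer flow can be tried out without a batch runtime. The sketch below is a plain-Java stand-in, assuming the collector is invoked after every chunk checkpoint and once more at the end of each partition; `PartitionFlowDemo`, `Partition`, and `Analyzer` are hypothetical names that only model the roles of the real artifacts, and the partitions run one after another here rather than in parallel.

```java
import java.util.Collections;
import java.util.List;

// Plain-Java model of the collector/analyzer flow; none of these classes are JSR 352 API.
final class PartitionFlowDemo {

    // Models one partition: its own counter, like its own StepContext transient user data.
    static final class Partition {
        private int uncollectedRows = 0;

        // Models writeItems(): remember how many rows this chunk wrote.
        void writeItems(List<Object> items) {
            uncollectedRows += items.size();
        }

        // Models collectPartitionData(): report rows since the last call, then reset.
        int collect() {
            int delta = uncollectedRows;
            uncollectedRows = 0;
            return delta;
        }
    }

    // Models the single analyzer of the step: sums every payload it receives.
    static final class Analyzer {
        private int rowCount = 0;

        void analyze(int fromCollector) {
            rowCount += fromCollector;
        }

        int total() {
            return rowCount;
        }
    }

    // Runs each partition over its chunk sizes and returns the analyzer's total.
    static int run(int[][] chunkSizesPerPartition) {
        Analyzer analyzer = new Analyzer();
        for (int[] chunkSizes : chunkSizesPerPartition) {
            Partition partition = new Partition();
            for (int size : chunkSizes) {
                List<Object> chunk = Collections.nCopies(size, new Object());
                partition.writeItems(chunk);
                analyzer.analyze(partition.collect()); // after each checkpoint
            }
            analyzer.analyze(partition.collect());     // once more at end of partition
        }
        return analyzer.total();
    }

    public static void main(String[] args) {
        int total = run(new int[][] { {10, 10, 3}, {10, 7} });
        System.out.printf("%d rows processed (all partitions).%n", total);
    }
}
```

If `collect()` returned the cumulative count without resetting it, the extra call at the end of each partition (and every checkpoint after the first) would be added again and the total would come out too high.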

Reference: JSR352 v1.0 Final Release.pdf
