JSR 352: How to collect data from the Writer of each Partition of a Partitioned Step?
Question
I have 2 partitions in a step that writes to a database. I want to record the number of rows written in each partition, sum them, and print the total to the log.
I was thinking of using a static variable in the Writer and using the Step Context/Job Context to read it in afterStep() of the Step Listener. However, when I tried it I got null. I am able to get these values in close() of the Reader.
Is this the right way to go about it? Or should I use a Partition Collector/Reducer/Analyzer?
I am using Java Batch on WebSphere Liberty, and I am developing in Eclipse.
Recommended answer
I was thinking of using a static variable in the Writer and using the Step Context/Job Context to read it in afterStep() of the Step Listener. However, when I tried it I got null.
The ItemWriter might already be destroyed at this point, but I'm not sure.
Is this the right way to go about it?
Yes, it should be good enough. However, you need to make sure the total row count is shared by all partitions, because the batch runtime maintains a separate StepContext clone per partition. You should use JobContext instead.
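If you go with a shared counter rather than per-partition data, keep in mind that partitions run concurrently, so the counter has to be thread-safe. Here is a minimal plain-Java sketch, assuming all partitions run as threads in the same JVM. SharedRowCounter and its methods are hypothetical names for illustration, not part of the batch API, and the threads only stand in for partition writers.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of the batch API.
class SharedRowCounter {
    private static final AtomicLong TOTAL = new AtomicLong();

    // A writer would call this after each successful chunk write.
    static void add(int rows) {
        TOTAL.addAndGet(rows);
    }

    static long total() {
        return TOTAL.get();
    }

    // Simulates `partitions` writer threads, each writing `chunks` chunks of `rowsPerChunk` rows.
    static long simulate(int partitions, int chunks, int rowsPerChunk) {
        TOTAL.set(0);
        Thread[] workers = new Thread[partitions];
        for (int p = 0; p < partitions; p++) {
            workers[p] = new Thread(() -> {
                for (int c = 0; c < chunks; c++) {
                    add(rowsPerChunk);
                }
            });
            workers[p].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        }
        return total();
    }

    public static void main(String[] args) {
        System.out.println(simulate(2, 1000, 10) + " rows written (all partitions)");
    }
}
```

A plain static int incremented with ++ could lose updates when two partitions write at the same time, which is why the sketch uses AtomicLong.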
I think using PartitionCollector and PartitionAnalyzer is a good choice, too. The PartitionCollector interface has a method, collectPartitionData(), that collects data coming from its partition. Once collected, the batch runtime passes this data to the PartitionAnalyzer for analysis. Note that there are:
- N PartitionCollectors per step (1 per partition)
- N StepContexts per step (1 per partition)
- 1 PartitionAnalyzer per step
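To make the N-to-1 data flow concrete, here is a plain-Java simulation of it, with no batch runtime involved. Everything in it (PartitionFlowSketch, the queue, the made-up chunk sizes) is an assumption for illustration. Each thread stands in for one partition with its own private count, and the single Analyzer stands in for the PartitionAnalyzer.

```java
import java.io.Serializable;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Plain-Java simulation of the collector/analyzer topology; not the batch API.
class PartitionFlowSketch {

    // Stands in for the single PartitionAnalyzer of the step.
    static class Analyzer {
        int rowCount = 0;

        void analyzeCollectorData(Serializable fromCollector) {
            rowCount += (Integer) fromCollector;
        }
    }

    // chunksPerPartition[p][c] = rows written by partition p in chunk c (made-up input).
    static int run(int[][] chunksPerPartition) {
        Queue<Serializable> toAnalyzer = new ConcurrentLinkedQueue<>();
        Thread[] partitions = new Thread[chunksPerPartition.length];
        for (int p = 0; p < partitions.length; p++) {
            int[] chunks = chunksPerPartition[p];
            partitions[p] = new Thread(() -> {
                int partRowCount = 0; // private to this partition, like its StepContext
                for (int rows : chunks) {
                    partRowCount += rows;                         // "writer" records the chunk
                    toAnalyzer.add(Integer.valueOf(partRowCount)); // "collector" hands it over
                    partRowCount = 0;                             // and starts counting afresh
                }
            });
            partitions[p].start();
        }
        try {
            for (Thread t : partitions) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for partitions", e);
        }
        // One analyzer sums everything the N collectors sent.
        Analyzer analyzer = new Analyzer();
        Serializable payload;
        while ((payload = toAnalyzer.poll()) != null) {
            analyzer.analyzeCollectorData(payload);
        }
        return analyzer.rowCount;
    }

    public static void main(String[] args) {
        System.out.println(run(new int[][] {{3, 2}, {4}}) + " rows written (all partitions)");
    }
}
```

In the real runtime the analyzer is called as payloads arrive rather than after all partitions finish. The sketch drains the queue at the end only to keep the example short.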
The number of records written can be passed via the StepContext's transientUserData. Since each StepContext is reserved for its own step partition, the transient user data won't be overwritten by other partitions.
Here is the implementation:
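MyItemWriter:

    @Inject
    private StepContext stepContext;

    @Override
    public void writeItems(List<Object> items) throws Exception {
        // ... write the items to the database ...

        // Add this chunk's rows to the count kept in this partition's StepContext
        // (assumes one row is written per item).
        Object userData = stepContext.getTransientUserData();
        int partRowCount = userData != null ? (Integer) userData : 0;
        stepContext.setTransientUserData(partRowCount + items.size());
    }

MyPartitionCollector:

    @Inject
    private StepContext stepContext;

    @Override
    public Serializable collectPartitionData() throws Exception {
        // Rows written by this partition since the last call.
        Object userData = stepContext.getTransientUserData();
        int partRowCount = userData != null ? (Integer) userData : 0;
        // The runtime calls this after every chunk checkpoint and once more at the
        // end of the partition, so reset the count to avoid reporting rows twice.
        stepContext.setTransientUserData(0);
        return partRowCount;
    }

MyPartitionAnalyzer:

    private int rowCount = 0;

    @Override
    public void analyzeCollectorData(Serializable fromCollector) throws Exception {
        // Keeps a running total over all partitions.
        rowCount += (Integer) fromCollector;
        System.out.printf("%d rows written so far (all partitions).%n", rowCount);
    }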
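For the runtime to invoke the collector and analyzer, they also have to be declared in the step's partition element in the job XML. A sketch of that wiring, assuming the artifacts resolve under these ref names. The step id, the ref names, the reader, and the item-count value are placeholders, not taken from the question.

```xml
<step id="writeStep">
    <chunk item-count="100">
        <reader ref="myItemReader"/>
        <writer ref="myItemWriter"/>
    </chunk>
    <partition>
        <!-- two partitions, as in the question -->
        <plan partitions="2"/>
        <collector ref="myPartitionCollector"/>
        <analyzer ref="myPartitionAnalyzer"/>
    </partition>
</step>
```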
Reference: JSR352 v1.0 Final Release.pdf