How do I keep track of parsing progress of large files in StAX?
Question
I'm processing large (1TB) XML files using the StAX API. Let's assume we have a loop handling some elements:
XMLInputFactory fac = XMLInputFactory.newInstance();
XMLStreamReader reader = fac.createXMLStreamReader(new FileReader(inputFile));
// Loop until the end of the document instead of forever
while (reader.hasNext()) {
    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
        // handle contents
    }
}
How do I keep track of overall progress within the large XML file? Fetching the offset from reader works fine for smaller files:
int offset = reader.getLocation().getCharacterOffset();
but since the offset is an int, it will probably only work for files up to 2 GB.
Recommended answer
A simple FilterReader that counts the characters passing through it should work.
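import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

class ProgressCounter extends FilterReader {
    // Characters consumed so far; a long, so it does not overflow at 2^31 - 1
    private long progress = 0;

    public ProgressCounter(Reader in) {
        super(in);
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        progress += skipped; // count what was actually skipped, not what was requested
        return skipped;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int red = super.read(cbuf, off, len);
        if (red > 0) {
            progress += red; // -1 signals end of stream and must not be counted
        }
        return red;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c != -1) {
            progress++; // the return value is the character itself, so count one
        }
        return c;
    }

    public long getProgress() {
        return progress;
    }
}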
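A Reader-based counter reports characters, which do not map directly onto the file size in bytes when the encoding is multi-byte. If a percentage is wanted, one possible variant is to count bytes on an InputStream instead and compare against File.length(). The following is only a sketch under that assumption: CountingInputStream, ProgressDemo, countElements and percent are hypothetical names, not part of StAX. Because the parser reads in buffered chunks, the count may run slightly ahead of the actual parse position.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper: counts the bytes that are read through it. */
class CountingInputStream extends FilterInputStream {
    private long count = 0;

    CountingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte consumed
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n; // ignore the -1 end-of-stream marker
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        count += skipped;
        return skipped;
    }

    long getCount() {
        return count;
    }
}

/** Hypothetical demo: parses a stream and reports progress along the way. */
class ProgressDemo {
    static double percent(long done, long total) {
        return total == 0 ? 100.0 : 100.0 * done / total;
    }

    /** Returns the number of start elements seen in the stream. */
    static int countElements(CountingInputStream in, long totalBytes) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        int elements = 0;
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                elements++;
                // Report only occasionally to keep the overhead low
                if (elements % 1_000_000 == 0) {
                    System.out.printf("%.1f%% read%n", percent(in.getCount(), totalBytes));
                }
            }
        }
        reader.close();
        return elements;
    }
}
```

For a real file, the stream would be something like new CountingInputStream(new FileInputStream(inputFile)), with inputFile.length() passed as totalBytes.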