How do I keep track of parsing progress of large files in StAX?

Question

I'm processing large (1TB) XML files using the StAX API. Let's assume we have a loop handling some elements:

XMLInputFactory fac = XMLInputFactory.newInstance();
XMLStreamReader reader = fac.createXMLStreamReader(new FileReader(inputFile));
// Loop on hasNext()/next(): nextTag() inside while (true) throws once the document ends
while (reader.hasNext()) {
    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
        // handle contents
    }
}

How do I keep track of overall progress within the large XML file? Fetching the offset from reader works fine for smaller files:

int offset = reader.getLocation().getCharacterOffset();

But since the offset is an int, it will probably only work for files up to 2 GB.

Recommended answer

A simple FilterReader should work.

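import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

class ProgressCounter extends FilterReader {
    // Number of characters (not bytes) consumed so far
    long progress = 0;

    @Override
    public long skip(long n) throws IOException {
        // Count what was actually skipped, which may be less than n
        long skipped = super.skip(n);
        progress += skipped;
        return skipped;
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int red = super.read(cbuf, off, len);
        // read returns -1 at end of stream; do not subtract it from the count
        if (red > 0) {
            progress += red;
        }
        return red;
    }

    @Override
    public int read() throws IOException {
        // The return value is the character itself, so count one char per successful read
        int red = super.read();
        if (red != -1) {
            progress++;
        }
        return red;
    }

    public ProgressCounter(Reader in) {
        super(in);
    }

    public long getProgress() {
        return progress;
    }
}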
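
A Reader-based counter tracks characters, so comparing it against the file size in bytes can be off for multi-byte encodings. If a percentage of the file length is the goal, one option is to count bytes underneath the parser instead. The following is a minimal sketch of that idea, not part of the original answer: the class names ByteProgressDemo and CountingInputStream and the method parseAndCount are made up for illustration, and a small in-memory document stands in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ByteProgressDemo {

    /** Hypothetical helper: counts the bytes the parser pulls from the wrapped stream. */
    static class CountingInputStream extends FilterInputStream {
        private long count = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                count++; // one byte per successful single-byte read
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                count += n; // ignore the -1 end-of-stream marker
            }
            return n;
        }

        @Override
        public long skip(long n) throws IOException {
            long skipped = super.skip(n);
            count += skipped; // only what was actually skipped
            return skipped;
        }

        long getCount() {
            return count;
        }
    }

    /** Parses the document, reports progress at each start element, returns bytes consumed. */
    static long parseAndCount(byte[] xml) throws Exception {
        CountingInputStream in = new CountingInputStream(new ByteArrayInputStream(xml));
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
        long total = xml.length; // for a real file this would be the file length in bytes
        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                System.out.printf("<%s>: %d of %d bytes read%n",
                        reader.getLocalName(), in.getCount(), total);
            }
        }
        reader.close();
        return in.getCount();
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<root><a/><b/></root>".getBytes(StandardCharsets.UTF_8);
        parseAndCount(xml);
    }
}
```

Because the parser reads ahead in buffer-sized chunks, the count runs slightly ahead of the element currently being handled. On a terabyte-sized file that difference should be negligible, and the long counter avoids the 2 GB limit of an int offset.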
