Parquet Writer to buffer or byte stream


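Question

I have a Java application that converts JSON messages to Parquet format. Is there a Parquet writer in Java that can write to a buffer or byte stream? Most of the examples I have seen write to a file.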

Recommended answer

TLDR; you need to implement OutputFile, for example something like:

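import org.apache.parquet.io.OutputFile;
import org.apache.parquet.io.PositionOutputStream;

import java.io.BufferedOutputStream;
import java.io.IOException;

public class ParquetBufferedWriter implements OutputFile {

    private final BufferedOutputStream out;

    public ParquetBufferedWriter(BufferedOutputStream out) {
        this.out = out;
    }

    @Override
    public PositionOutputStream create(long blockSizeHint) throws IOException {
        return createPositionOutputstream();
    }

    private PositionOutputStream createPositionOutputstream() {
        return new PositionOutputStream() {
            // Bytes written so far; Parquet uses this for the offsets it stores in the footer.
            private long pos = 0;

            @Override
            public long getPos() throws IOException {
                return pos;
            }

            @Override
            public void write(int b) throws IOException {
                out.write(b);
                pos++;
            }

            @Override
            public void write(byte[] b, int off, int len) throws IOException {
                out.write(b, off, len);
                pos += len;
            }

            @Override
            public void flush() throws IOException {
                out.flush();
            }

            // Closing flushes the buffer so no trailing bytes are lost.
            @Override
            public void close() throws IOException {
                out.close();
            }
        };
    }

    @Override
    public PositionOutputStream createOrOverwrite(long blockSizeHint) throws IOException {
        return createPositionOutputstream();
    }

    // Block sizes only matter for block-based file systems such as HDFS.
    @Override
    public boolean supportsBlockSize() {
        return false;
    }

    @Override
    public long defaultBlockSize() {
        return 0;
    }

}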

Your writer would then look like this:

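    // The in-memory target is an assumption here; any OutputStream can be wrapped.
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ParquetBufferedWriter out = new ParquetBufferedWriter(new BufferedOutputStream(bytes));
    try (ParquetWriter<Record> writer = AvroParquetWriter
            .<Record>builder(out)
            .withRowGroupSize(DEFAULT_BLOCK_SIZE)
            .withPageSize(DEFAULT_PAGE_SIZE)
            .withSchema(SCHEMA)
            .build()) {

        for (Record record : records) {
            writer.write(record);
        }
    } catch (IOException e) {
        throw new IllegalStateException(e);
    }
    // Once the writer is closed, the complete Parquet file is held in memory.
    byte[] parquetBytes = bytes.toByteArray();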
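
One detail in the OutputFile implementation is worth checking: getPos(). Parquet stores the byte offsets of row groups and column chunks in the file footer and reads them from PositionOutputStream.getPos(), so that method needs to report how many bytes have been written so far. The following is a minimal sketch of that bookkeeping using only the JDK. CountingOutputStream and PositionTrackingDemo are hypothetical names chosen for illustration, not Parquet classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper: forwards bytes to a delegate and counts them.
class CountingOutputStream extends OutputStream {
    private final OutputStream delegate;
    private long position = 0;

    CountingOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    // Number of bytes written so far; the value a getPos() override should return.
    long getPos() {
        return position;
    }

    @Override
    public void write(int b) throws IOException {
        delegate.write(b);
        position++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        delegate.write(b, off, len);
        position += len;
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}

class PositionTrackingDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CountingOutputStream out = new CountingOutputStream(sink)) {
            out.write('P');                               // single-byte write
            out.write(new byte[] {'A', 'R', '1'}, 0, 3);  // bulk write
            System.out.println(out.getPos());             // position after 4 bytes
        }
        System.out.println(sink.toByteArray().length);    // bytes that reached the sink
    }
}
```

Both the single-byte and the bulk write paths have to update the counter. Counting in only one of them would let the reported position drift from the real stream length.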
