How to generate a dynamic path in a DataSet during the output method


Question

Is there a way to create a dynamic DataSink output path in Flink?

The DataSet has the data type Tuple2<String, String>.

When we tried using streaming, I was able to generate a dynamic path with a custom Bucketer, like below:

    // Route each element to a directory named after its lower-cased, trimmed first field.
    @Override
    public Path getBucketPath(Clock clock, Path basePath, Tuple2<String, String> element) {
        return new Path(basePath + "/schema=" + element.f0.toLowerCase().trim() + "/");
    }

I would like to know whether there is a similar way to generate a custom path when working with a DataSet.

Recommended answer

I poked around a bit and didn't find anything similar for batch processing. That means, I think, you'd have to create your own OutputFormat class that wraps a regular FileOutputFormat and does the bucketing, using the same Bucketer interface.
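
To make the idea concrete, here is a minimal sketch of the bucketing logic such a custom OutputFormat would need. It is not Flink code: BucketingWriter, bucketFor, and the part-<taskNumber> file name are hypothetical names chosen for illustration, and the class uses only the JDK so it can run on its own. It only mirrors the writeRecord/close lifecycle of Flink's OutputFormat; in a real job the same logic would sit inside a class implementing OutputFormat<Tuple2<String, String>>, with the task number taken from open(taskNumber, numTasks).

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: imitates the writeRecord/close lifecycle of Flink's
// OutputFormat with plain JDK classes. It is not a Flink class.
class BucketingWriter {
    private final Path basePath;
    private final int taskNumber;
    // One open writer per bucket directory seen so far.
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    BucketingWriter(Path basePath, int taskNumber) {
        this.basePath = basePath;
        this.taskNumber = taskNumber;
    }

    // Same rule as getBucketPath in the question: schema=<lower-cased, trimmed key>.
    static String bucketFor(String key) {
        return "schema=" + key.toLowerCase().trim();
    }

    // key plays the role of element.f0, value the role of element.f1
    // (writing f1 as one text line is an assumption of this sketch).
    void writeRecord(String key, String value) throws IOException {
        String bucket = bucketFor(key);
        BufferedWriter writer = writers.get(bucket);
        if (writer == null) {
            // First record for this bucket: create its directory and part file.
            Path dir = basePath.resolve(bucket);
            Files.createDirectories(dir);
            writer = Files.newBufferedWriter(
                    dir.resolve("part-" + taskNumber), StandardCharsets.UTF_8);
            writers.put(bucket, writer);
        }
        writer.write(value);
        writer.newLine();
    }

    // Flush and close every bucket file that was opened.
    void close() throws IOException {
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }
}
```

Including the task number in the file name is a design choice so that parallel tasks writing to the same bucket directory do not overwrite each other's files. Keeping one writer open per bucket is simple, but it assumes the number of distinct keys is small enough to hold that many open files at once.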
