How to generate a dynamic path in a DataSet during the output method
Question
Is there a way to create a dynamic DataSink output path in Flink?
The DataSet's element type is Tuple2<String, String>.
When we used the streaming API, I could generate a dynamic path with a custom Bucketer, like this:
@Override
public Path getBucketPath(Clock clock, Path basePath, Tuple2<String, String> element) {
    // Route each element to a sub-directory named after its first field.
    return new Path(basePath + "/schema=" + element.f0.toLowerCase().trim() + "/");
}
I would like to know whether there is a similar way to generate a custom path for a DataSet.
Recommended answer
I poked around a bit and didn't find anything similar for batch processing. I think that means you'd have to create your own OutputFormat class that wraps a regular FileOutputFormat and does the bucketing, using the same Bucketer interface.
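As a rough illustration of the bucketing idea, here is a minimal sketch in plain Java with no Flink dependency. The class name BucketingWriter, the bucketFor helper, and the part-<taskNumber> file naming are all hypothetical, chosen for this example. The open, writeRecord, and close methods are meant to mirror the lifecycle methods of Flink's OutputFormat interface; a real implementation would implement OutputFormat<Tuple2<String, String>> and would probably delegate to one FileOutputFormat per bucket instead of writing files directly.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: routes each (schema, payload) record to a
 * per-schema directory under a base path. Plain Java only; the method
 * names mirror Flink's OutputFormat lifecycle but nothing here is Flink API.
 */
class BucketingWriter implements AutoCloseable {
    private final Path basePath;
    // One open writer per bucket directory, created lazily on first record.
    private final Map<String, BufferedWriter> writers = new HashMap<>();
    private int taskNumber;

    BucketingWriter(Path basePath) {
        this.basePath = basePath;
    }

    // Mirrors OutputFormat.open(taskNumber, numTasks): remember which
    // parallel task this is so that each task writes its own part file.
    void open(int taskNumber, int numTasks) {
        this.taskNumber = taskNumber;
    }

    // Same naming rule as the Bucketer in the question.
    static String bucketFor(String schema) {
        return "schema=" + schema.toLowerCase().trim();
    }

    // Mirrors OutputFormat.writeRecord: pick the bucket, then append a line.
    void writeRecord(String schema, String payload) throws IOException {
        String bucket = bucketFor(schema);
        BufferedWriter writer = writers.get(bucket);
        if (writer == null) {
            Path dir = basePath.resolve(bucket);
            Files.createDirectories(dir);
            writer = Files.newBufferedWriter(
                    dir.resolve("part-" + taskNumber), StandardCharsets.UTF_8);
            writers.put(bucket, writer);
        }
        writer.write(payload);
        writer.newLine();
    }

    // Mirrors OutputFormat.close: flush and close every bucket's writer.
    @Override
    public void close() throws IOException {
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }
}
```

In a Flink job, a class built along these lines would be handed to the DataSet through its output(...) method. Keeping one writer per bucket in a map is simple, but it holds one open file per distinct schema value, so it assumes the number of distinct values is small.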