Dataflow GroupBy -> multiple outputs based on keys


Question

Is there a simple way to redirect the output of GroupBy into multiple output files based on the group keys?

Bin.apply(GroupByKey.<String, KV<Long, Iterable<TableRow>>>create())
   .apply(ParDo.named("Print Bins").of( ... ))
   .apply(TextIO.Write.to(*Output file based on key*))

If a Sink is the solution, could you please share some sample code?

Thanks!

Recommended answer

Beam 2.2 will include an API for exactly this: TextIO.write().to(DynamicDestinations) (see the source). Until it is released, you can use this API through the 2.2.0-SNAPSHOT version. Note that the API is experimental and might change in Beam 2.3 or later.
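
The answer names the dynamic-destination API but shows no code. As a rough illustration of the underlying idea only (map each record to a destination key, then derive a file name from that key), here is a plain-Java sketch that uses no Beam classes. The class PerKeyFileWriter, its methods, and the bin-<key>.txt naming rule are hypothetical names made up for this example. It is not the TextIO API, and in a real pipeline the runner would handle the grouping and the writing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone sketch: NOT Beam code, only the routing idea.
public class PerKeyFileWriter {

    // Hypothetical naming rule: one text file per key, e.g. "bin-a.txt".
    // Characters outside [A-Za-z0-9_-] are replaced with '_', so two
    // different keys could collide; a real pipeline needs a stricter rule.
    public static String fileNameFor(String key) {
        return "bin-" + key.replaceAll("[^A-Za-z0-9_-]", "_") + ".txt";
    }

    // Groups (key, line) pairs by key, keeping input order within each key.
    public static Map<String, List<String>> groupByKey(
            List<Map.Entry<String, String>> records) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (Map.Entry<String, String> r : records) {
            groups.computeIfAbsent(r.getKey(), k -> new ArrayList<>())
                  .add(r.getValue());
        }
        return groups;
    }

    // Writes each group to its own file under outputDir and returns the
    // paths that were written, in key order.
    public static List<Path> writeGroups(
            Map<String, List<String>> groups, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        for (Map.Entry<String, List<String>> g : groups.entrySet()) {
            Path target = outputDir.resolve(fileNameFor(g.getKey()));
            Files.write(target, g.getValue(), StandardCharsets.UTF_8);
            written.add(target);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        List<Map.Entry<String, String>> records = new ArrayList<>();
        records.add(new SimpleEntry<>("a", "row 1"));
        records.add(new SimpleEntry<>("b", "row 2"));
        records.add(new SimpleEntry<>("a", "row 3"));

        Path dir = Files.createTempDirectory("bins");
        for (Path p : writeGroups(groupByKey(records), dir)) {
            System.out.println(p.getFileName() + " <- " + Files.readAllLines(p));
        }
    }
}
```

The two functions that matter are the key extraction (here the entry key) and the key-to-file-name rule. A dynamic-destination writer is configured with the same two pieces, with the runner taking care of sharding and temporary files.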
