How does Flink scale for hot partitions?

Question

If I have a use case where I need to join two streams or aggregate some kind of metrics from a single stream, and I use keyed streams to partition the events, how does Flink handle the operations for hot partitions where the data might not fit into memory and needs to be split across partitions?

Recommended answer

Flink doesn't do anything automatic regarding hot partitions.

If you have a consistently hot partition, you can manually split it and pre-aggregate the splits.
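One common way to do this manual split is to append a "salt" to the key, aggregate per salted key, and then combine the partial results under the original key. The sketch below shows the idea in plain Java with no Flink dependency. The class name, `SALT_BUCKETS`, and both method names are hypothetical, chosen only for illustration. In a Flink job the two stages would roughly correspond to a `keyBy` on the salted key followed by a second `keyBy` on the original key, and the first stage would normally be windowed so that partial results are emitted periodically.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SaltedPreAggregation {

    // Assumption: how many sub-partitions each key is split into.
    static final int SALT_BUCKETS = 4;

    /** Stage 1: count events per salted key ("key#salt"). */
    static Map<String, Long> preAggregate(List<String> events) {
        Map<String, Long> partial = new HashMap<>();
        long seq = 0;
        for (String key : events) {
            // Round-robin salt, so a hot key is spread over SALT_BUCKETS sub-keys.
            int salt = (int) (seq++ % SALT_BUCKETS);
            partial.merge(key + "#" + salt, 1L, Long::sum);
        }
        return partial;
    }

    /** Stage 2: strip the salt and sum the partial counts per original key. */
    static Map<String, Long> combine(Map<String, Long> partial) {
        Map<String, Long> totals = new HashMap<>();
        for (Map.Entry<String, Long> e : partial.entrySet()) {
            String saltedKey = e.getKey();
            String key = saltedKey.substring(0, saltedKey.lastIndexOf('#'));
            totals.merge(key, e.getValue(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> events =
                List.of("hot", "hot", "hot", "hot", "hot", "cold", "hot", "cold");
        Map<String, Long> partial = preAggregate(events);
        System.out.println("partial counts: " + partial);
        System.out.println("final counts:   " + combine(partial));
    }
}
```

This only works when partial results can be merged, as with counts, sums, minimums, or maximums. The second stage receives at most `SALT_BUCKETS` partial results per key and flush, so it carries far less load than the raw stream did.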

If your concern is avoiding out-of-memory errors caused by unexpected load spikes on one partition, you can use a state backend that spills to disk, such as the RocksDB state backend.

If you want more dynamic data routing / partitioning, look at the Stateful Functions API or the Dynamic Data Routing section of this blog post.

If you want auto-scaling, see Autoscaling Apache Flink with Ververica Platform Autopilot.
