Apache Flink 中的 Keyby 数据分布,逻辑还是物理运算符? [英] Keyby data distribution in Apache Flink, Logical or Physical Operator?

查看:41
本文介绍了Apache Flink 中的 Keyby 数据分布,逻辑还是物理运算符?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

根据 Apache Flink 文档,KeyBy 转换在逻辑上将流划分为不相交的分区.具有相同键的所有记录都分配到同一个分区.

According to the Apache Flink documentation, KeyBy transformation logically partitions a stream into disjoint partitions. All records with the same key are assigned to the same partition.

KeyBy 是 100% 逻辑转换吗?它不包括用于跨集群节点分布的物理数据分区吗?如果是这样,那么如何保证所有具有相同key的记录都被分配到同一个分区?

Is KeyBy 100% logical transformation? Doesn't it include physical data partitioning for distribution across the cluster nodes? If so, then how it can guarantee that all the records with the same key are assigned to the same partition?

例如,假设我们从 n 个节点的 Apache Kafka 集群获取分布式数据流.运行我们的流式作业的 Apache Flink 集群由 m 个节点组成.当 keyBy 转换应用于传入的数据流时,它如何保证逻辑数据分区?还是涉及跨集群节点的物理数据分区?

For instance, assuming that we are getting a distributed data stream from Apache Kafka cluster of n nodes. Apache Flink cluster running our streaming job consists of m nodes. When the keyBy transformation is applied on the incoming data stream, how does it guarantees logical data partitioning? Or does it involve physical data partitioning across the cluster nodes?

我似乎对逻辑和物理数据分区感到困惑.

It seems I am confused between logical and physical data partitioning.

推荐答案

所有可能的键的键空间被分成一定数量的键组.key group的数量(与maximum parallelism相同)是你在搭建Flink集群时可以设置的配置参数;默认值为 128.

The keyspace of all possible keys is divided into some number of key groups. The number of key groups (which is the same as the maximum parallelism) is a configuration parameter you can set when setting up a Flink cluster; the default value is 128.

每个键都只属于一个键组.当集群启动时,键组在任务管理器之间划分——如果集群是从检查点或保存点启动的,这些快照按键组索引,每个任务管理器加载键中键的状态已分配的组.

Each key belongs to exactly one key group. When a cluster is launched, the key groups are divided among the task managers -- and if the cluster is started from a checkpoint or savepoint, those snapshots are indexed by key group, and each task manager loads the state for the keys in the key groups it has been assigned.

在作业运行时,每个任务管理器都知道用于计算键的键选择器函数,以及键如何映射到键组.TM 还知道将关键组划分给任务管理器.这使得将每条消息路由到负责该消息密钥的任务管理器变得简单.

While a job is running, every task manager knows the key selector functions used to compute the keys, and how keys map onto key groups. The TMs also know the partitioning of key groups to task managers. This makes it straightforward to route each message to the task manager responsible for that message's key.

详情:

一个密钥所属的密钥组的计算大致如下:

The key group that a key belongs to is computed roughly like this:

Object key = the result of your KeySelector function;
int keyHash = key.hashCode();
int keyGroupId = MathUtils.murmurHash(keyHash) % maxParallelism;

在给定实际并行度和 maxParallelism 的情况下,应该将给定键组中的元素路由到的运算符实例的索引

The index of the operator instance to which elements from a given key group should be routed given the actual parallelism and maxParallelism is computed as

keyGroupId * parallelism/maxParallelism

实际代码在org.apache.flink.runtime.state.KeyGroupRangeAssignment 如果你想看一看.

The actual code is in org.apache.flink.runtime.state.KeyGroupRangeAssignment if you want to take a look.

一个主要的收获是键组是不相交的,它们跨越键空间.换句话说,不可能出现不属于其中一个密钥组的密钥.每个密钥只属于一个密钥组,每个密钥组都属于一个任务管理器.

One major takeaway is that the key groups are disjoint, and they span the keyspace. In other words, it's not possible for a key to come along that doesn't belong to one of the key groups. Every key belongs to exactly one of the key groups, and every key group belongs to one of the task managers.

这篇关于Apache Flink 中的 Keyby 数据分布,逻辑还是物理运算符?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆