KeyBy data distribution in Apache Flink: logical or physical operator?

Views: 250 Published: 2020/11/8 21:10:09 Tags: apache-flink distributed-computing flink-streaming data-partitioning

Problem description

According to the Apache Flink documentation, the KeyBy transformation logically partitions a stream into disjoint partitions. All records with the same key are assigned to the same partition.
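
The documented guarantee can be illustrated with a minimal sketch, assuming a deterministic hash of the key taken modulo the number of partitions. This is a simplification for illustration only and is not Flink's actual assignment logic, which is more involved. The names `partition_for` and `key_by`, the sample records, and the partition count of 3 are all hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index (hypothetical helper, not a Flink API)."""
    # zlib.crc32 is stable across processes, unlike Python's built-in hash() for str,
    # so every sender computes the same index for the same key.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def key_by(records, key_selector, num_partitions):
    """Split records into disjoint partitions so that equal keys land together."""
    partitions = defaultdict(list)
    for record in records:
        index = partition_for(key_selector(record), num_partitions)
        partitions[index].append(record)
    return dict(partitions)


# Sample (key, value) records; the keys and values are made up for illustration.
records = [("user-a", 1), ("user-b", 2), ("user-a", 3), ("user-c", 4), ("user-b", 5)]
partitions = key_by(records, key_selector=lambda r: r[0], num_partitions=3)

for index, recs in sorted(partitions.items()):
    print(index, recs)
```

Because the partition index depends only on the key, every record carrying that key is routed to the same partition no matter which sender processes it.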

Is KeyBy a 100% logical transformation? Doesn't it include physical data partitioning for distribution across the cluster nodes? If so, how can it guarantee that all records with the same key are assigned to the same partition?

For instance, assume we are receiving a distributed data stream from an Apache Kafka cluster of n nodes, and the Apache Flink cluster running our streaming job consists of m nodes. When the keyBy transformation is applied to the incoming data stream, how does it guarantee logical data partitioning? Or does it involve physical data partitioning across the cluster nodes?

It seems I am confusing logical and physical data partitioning.
