AWS Kinesis leaseOwner confusion


Problem description

I have a very simple application running on a Spark cluster with 2 workers, consuming a Kinesis stream with 2 shards.

I checked the Kinesis Streams application state in DynamoDB (shown in this screenshot), in the N. Virginia region (us-east-1).

I start and stop workers from time to time, and I noticed that when the leaseOwner for both shards is the same worker, the application works fine.

But when I stop the current leaseOwner (10.0.7.63), ownership switches to the other worker (10.0.7.62). After that, my application keeps polling but no data is returned from Kinesis, even though the connection to Kinesis is still up.

My guess is that when ownership switches to another worker, the checkpoints on the new owner do not match what is left in Kinesis, so pulling data returns nothing.

Could anyone explain what is going on here? Is my guess right?

Thanks a lot.

Solution

First of all, a friendly reminder: define "workerID" in your application's configuration using the hostname. That gives you more readable owner names than bare IP addresses.
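As a rough illustration of that reminder, here is a minimal Python sketch that builds a hostname-based worker ID. The helper name `make_worker_id` and the `host:uuid` layout are assumptions made for this example, not part of any Kinesis library. How the resulting string is passed into your consumer's configuration depends on the client library and version you use, so that step is not shown.

```python
import socket
import uuid


def make_worker_id(hostname=None):
    """Build a readable worker ID (hypothetical helper, not a library API).

    The hostname makes the lease owner easy to recognise; the random suffix
    keeps IDs unique if several workers run on the same host.
    """
    host = hostname or socket.gethostname()
    return f"{host}:{uuid.uuid4()}"


if __name__ == "__main__":
    # Prints something like "<your-hostname>:<random uuid>".
    print(make_worker_id())
```

With IDs like this, it is easier to tell at a glance which machine currently holds each lease.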

Second, are you sure shard-000 receives any data? Maybe a static partition key is set on the producer side, which would cause all the data to pile up on shard-001 only.
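To see why a static partition key would leave one shard empty, here is a small offline sketch. Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The sketch assumes the two shards split the hash space evenly, which may not match your stream if it has been resharded. The function names are made up for this example, and nothing here calls AWS.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def hash_key(partition_key):
    """Map a partition key to its 128-bit integer hash key via MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16)


def shard_index(partition_key, shard_count=2):
    """Index of the shard that would receive this key.

    Assumption: the shards cover equal, contiguous slices of the hash space.
    """
    return hash_key(partition_key) * shard_count // HASH_SPACE


def distribution(keys, shard_count=2):
    """Count how many of the given keys land on each shard."""
    counts = [0] * shard_count
    for key in keys:
        counts[shard_index(key, shard_count)] += 1
    return counts


if __name__ == "__main__":
    # One fixed key: every record goes to the same shard.
    print("static key:", distribution(["fixed-key"] * 100))
    # Varying keys: records spread across both shards.
    print("varied keys:", distribution([f"key-{i}" for i in range(1000)]))
```

If the producer always sends the same key, one entry of the first list stays at zero. That matches the symptom of a worker that owns a shard but never receives records.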
