Spark Structured Streaming with secured Kafka throwing: Not authorized to access group exception

Problem description

To use Structured Streaming in my project, I am testing the integration of Spark 2.2.0 and Kafka 0.10.1 with Kerberos on my Hortonworks 2.6.3 environment, using a sample program to check the integration. The program runs without issues from IntelliJ in Spark local mode, but when I move the same program to YARN cluster or client mode on the Hadoop cluster, it throws the exception below.

I know I can configure a Kafka ACL for a group id, but Spark Structured Streaming generates a new group id for every query, so I cannot configure a fixed group id in the Kafka ACLs to get rid of the authorization exception. I am kind of stuck now.

14:19:59 org.apache.spark.sql.streaming.StreamingQueryException: Not authorized to access group: spark-kafka-source-632450e3-a111-4d09-8704-85320c572aeb--1213729126-driver-2

Exception:

18/01/31 14:46:34 INFO AbstractLogin: Successfully logged in.
18/01/31 14:46:34 INFO KerberosLogin: TGT refresh thread started.
18/01/31 14:46:34 INFO KerberosLogin: TGT valid starting at: Wed Jan 31 13:51:11 UTC 2018
18/01/31 14:46:34 INFO KerberosLogin: TGT expires: Wed Jan 31 23:51:14 UTC 2018
18/01/31 14:46:34 INFO KerberosLogin: TGT refresh sleeping until: Wed Jan 31 21:58:11 UTC 2018
Exception in thread "main" 18/01/31 14:46:34 INFO AppInfoParser: Kafka version : 0.10.1.2.6.3.0-235
18/01/31 14:46:34 INFO AppInfoParser: Kafka commitId : ba0af6800a08d2f8
org.apache.spark.sql.streaming.StreamingQueryException: Not authorized to access group: spark-kafka-source-632450e3-a111-4d09-8704-85320c572aeb--1213729126-driver-2
=== Streaming Query ===
Identifier: [id = 64a8dbd2-c674-43f7-947d-9aac1667b2b0, runId = 70ce5ee9-ead6-44eb-a7cd-93619b10b811]
Current Committed Offsets: {}
Current Available Offsets: {}

Current State: ACTIVE
Thread State: RUNNABLE

Logical Plan:
Project [value#16]
+- Project [cast(key#0 as string) AS key#15, cast(value#1 as string) AS value#16]
   +- StreamingExecutionRelation KafkaSource[Subscribe[test_topic]], [key#0, value#1, topic#2, partition#3, offset#4L, timestamp#5, timestampType#6]

        at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:343)
        at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:206)
Caused by: org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: spark-kafka-source-632450e3-a111-4d09-8704-85320c572aeb--1213729126-driver-2
18/01/31 14:46:34 ERROR StreamExecution: Query [id = 01bd97ea-6d2c-446c-a366-491d252925aa, runId = cc8dc932-9297-47c5-b30b-007624c03163] terminated with error
org.apache.kafka.common.errors.GroupAuthorizationException: Not authorized to access group: spark-kafka-source-d690d270-7092-4aed-82c2-97fdfd80d0ed--604732661-driver-2
18/01/31 14:46:34 WARN KerberosLogin: TGT renewal thread has been interrupted and will exit.
18/01/31 14:46:34 INFO SparkContext: Invoking stop() from shutdown hook
18/01/31 14:46:34 INFO AbstractConnector: Stopped Spark@37524c9b{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/01/31 14:46:34 INFO SparkUI: Stopped Spark web UI at http://192.168.0.19:4040
18/01/31 14:46:34 INFO YarnClientSchedulerBackend: Interrupting monitor thread
18/01/31 14:46:34 INFO YarnClientSchedulerBackend: Shutting down all executors
18/01/31 14:46:34 INFO YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down

Recommended answer

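You can solve this with a prefixed (wildcard-style) ACL: grant the principal READ access to every consumer group whose name starts with spark-kafka-source-, the prefix Spark uses for the group ids it generates. Note that the --resource-pattern-type option is only available in Kafka 2.0 and later.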

bin/kafka-acls --authorizer kafka.security.auth.SimpleAclAuthorizer \
               --authorizer-properties zookeeper.connect=zk:2181 \
               --add --allow-principal User:'Bon' --operation READ \
               --topic topicName --group='spark-kafka-source-' \
               --resource-pattern-type prefixed
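
As a rough illustration of why the prefixed pattern helps, the sketch below mimics the matching rule in plain Python: a prefixed pattern matches any resource name that starts with the pattern, while a literal pattern has to match the full name. The two group ids are copied from the log above. The function names are hypothetical helpers written for this example, not part of Kafka or Spark, and the literal check ignores Kafka's special `*` wildcard for simplicity.

```python
# Group ids copied from the exception log in the question.
GENERATED_GROUP_IDS = [
    "spark-kafka-source-632450e3-a111-4d09-8704-85320c572aeb--1213729126-driver-2",
    "spark-kafka-source-d690d270-7092-4aed-82c2-97fdfd80d0ed--604732661-driver-2",
]

# The pattern passed to --group in the kafka-acls command.
ACL_PATTERN = "spark-kafka-source-"


def prefixed_acl_matches(resource_name: str, pattern: str) -> bool:
    """Hypothetical helper: a prefixed pattern matches names that start with it."""
    return resource_name.startswith(pattern)


def literal_acl_matches(resource_name: str, pattern: str) -> bool:
    """Hypothetical helper: a literal pattern must equal the full name."""
    return resource_name == pattern


if __name__ == "__main__":
    for group_id in GENERATED_GROUP_IDS:
        # Each generated id differs, but all share the same prefix.
        print(
            group_id,
            "prefixed:", prefixed_acl_matches(group_id, ACL_PATTERN),
            "literal:", literal_acl_matches(group_id, ACL_PATTERN),
        )
```

Because every query gets a fresh suffix, a literal ACL would have to be re-created for each run, whereas a single prefixed ACL covers all of them.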

Hope this helps!
