Error with Kerberos authentication when executing Flink example code on YARN cluster (Cloudera)


Question

I was trying to run the Flink example code (flinkexamplesWordCount.jar) on a YARN cluster, but I am getting the security authentication error below.

org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

I am not sure where the issue is or what I am missing. I can run Spark and MapReduce jobs on the same Cloudera Hadoop cluster without any issue.

I updated the configuration file paths for hdfs-site.xml and core-site.xml in flink-conf.yaml (on both the master and the worker nodes) and also exported the HADOOP_CONF_DIR path. I also tried giving host:port in the HDFS file path when executing the flink run command.

Error message

22:14:25,138 ERROR   org.apache.flink.client.CliFrontend                           - Error while running the command.
org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at org.apache.flink.client.program.Client.run(Client.java:413)
        at org.apache.flink.client.program.Client.run(Client.java:356)
        at org.apache.flink.client.program.Client.run(Client.java:349)
        at org.apache.flink.client.program.ContextEnvironment.execute(ContextEnvironment.java:63)
        at org.apache.flink.examples.java.wordcount.WordCount.main(WordCount.java:78)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:437)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:353)
        at org.apache.flink.client.program.Client.run(Client.java:315)
        at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:584)
        at org.apache.flink.client.CliFrontend.run(CliFrontend.java:290)
        at org.apache.flink.client.CliFrontend$2.run(CliFrontend.java:873)
        at org.apache.flink.client.CliFrontend$2.run(CliFrontend.java:870)
        at org.apache.flink.runtime.security.SecurityUtils$1.run(SecurityUtils.java:50)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.flink.runtime.security.SecurityUtils.runSecured(SecurityUtils.java:47)
        at org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:870)
        at org.apache.flink.client.CliFrontend.main(CliFrontend.java:922)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Cannot initialize task 'DataSink (CsvOutputFormat (path: hdfs://10.94.146.126:8020/user/qawsbtch/flink_out, delimiter:  ))': SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]

Recommended answer

(I had a private conversation with the author of the original question to work out this solution.)

The log files posted in the comments on the original question indicate that the job was submitted to a standalone installation of Flink. Standalone Flink currently supports accessing Kerberos-secured HDFS only if the user is authenticated on all worker nodes. With Flink on YARN, only the user starting the job on YARN needs to be authenticated with Kerberos.
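Since the requirement above is that the user holds a valid Kerberos ticket on the node in question (every worker for standalone Flink, only the submitting node for YARN), a quick check can be scripted. The following is only a sketch: the function name has_valid_ticket is made up for illustration, and it assumes a klist binary that supports the -s flag (as MIT Kerberos does), which exits with status 0 when the credentials cache is readable and unexpired.

```python
import subprocess


def has_valid_ticket(klist_cmd="klist"):
    """Return True if `<klist_cmd> -s` exits with status 0.

    Assumption: klist_cmd is a Kerberos klist that supports -s
    (silent mode; exit status 0 means a readable, unexpired
    credentials cache). The function name is hypothetical.
    """
    try:
        result = subprocess.run([klist_cmd, "-s"], check=False)
    except OSError:
        # klist is not installed or not executable on this node.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if has_valid_ticket():
        print("valid Kerberos ticket found")
    else:
        print("no valid Kerberos ticket; run kinit first")
```

Run it as the user that starts Flink; for a standalone setup it would have to succeed on each worker node.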

There was also a second issue in the comment section:

robert@cdh544-worker-0:~/hd22/flink-0.9.0$ ./bin/yarn-session.sh -n 2
20:39:50,563 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032
20:39:50,600 INFO  org.apache.flink.yarn.FlinkYarnClient                         - Using values:
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  TaskManager count = 2
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  JobManager memory = 1024
20:39:50,602 INFO  org.apache.flink.yarn.FlinkYarnClient                         -  TaskManager memory = 1024
20:39:51,708 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:52,710 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:53,712 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
20:39:54,714 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

The problem is that you are using Flink 0.9.0 (with Hadoop 2.2.0 included) on a cluster running Hadoop/YARN 2.6.0 with YARN HA enabled. Flink's old (2.2.0) Hadoop library is not able to properly read the ResourceManager address for an HA setup, so the client falls back to 0.0.0.0:8032, as seen in the log above.

Downloading the Flink build with Hadoop 2.6.0 included will make it work.
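To see whether a cluster is affected by this, it can help to check how the ResourceManager address is declared in yarn-site.xml. The sketch below is an illustration, not part of Flink or Hadoop: the function names are invented, and the fallback directory /etc/hadoop/conf is an assumption. It relies only on the standard YARN property names yarn.resourcemanager.ha.enabled, yarn.resourcemanager.ha.rm-ids and yarn.resourcemanager.address.

```python
import os
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_path):
    """Parse a Hadoop-style *-site.xml file into a {name: value} dict."""
    root = ET.parse(xml_path).getroot()
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def describe_resourcemanager(props):
    """Summarize whether YARN HA is enabled and which address applies."""
    if props.get("yarn.resourcemanager.ha.enabled", "false").lower() == "true":
        raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
        rm_ids = [i.strip() for i in raw_ids.split(",") if i.strip()]
        return {"ha": True, "rm_ids": rm_ids}
    # Without HA, an unset address effectively resolves to 0.0.0.0:8032.
    address = props.get("yarn.resourcemanager.address", "0.0.0.0:8032")
    return {"ha": False, "address": address}


if __name__ == "__main__":
    # /etc/hadoop/conf is an assumed fallback, not taken from the question.
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "/etc/hadoop/conf")
    path = os.path.join(conf_dir, "yarn-site.xml")
    if os.path.exists(path):
        print(describe_resourcemanager(read_hadoop_properties(path)))
    else:
        print("yarn-site.xml not found at", path)
```

If the result reports HA with per-ID ResourceManager entries while the client still tries 0.0.0.0:8032, that matches the symptom described above, where the bundled Hadoop client does not understand the HA settings.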
