Cannot use Apache Flink in Amazon EMR

Problem description

I cannot start a YARN session of Apache Flink on Amazon EMR. The error message I get is:

$ tar xvfz flink-0.9.0-bin-hadoop26.tgz
$ cd flink-0.9.0
$ ./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
...
Diagnostics: File file:/home/hadoop/.flink/application_1439466798234_0008/flink-conf.yaml does not exist
java.io.FileNotFoundException: File file:/home/hadoop/.flink/application_1439466798234_0008/flink-conf.yaml does not exist
...

I am using Flink version 0.9 and Amazon's Hadoop version 4.0.0. Any ideas or hints?

The full log can be found here: https://gist.github.com/headmyshoulder/48279f06c1850c62c28c

Recommended answer

From the log:

The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the sytem is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system

Flink failed to read the Hadoop configuration files. They are either picked up from the environment variables, e.g. HADOOP_HOME, or you can set the configuration dir in the flink-conf.yaml before you execute your YARN command.
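For the flink-conf.yaml route, here is a minimal sketch. It assumes the relevant key is fs.hdfs.hadoopconf, which is how the Flink configuration page of that era names the Hadoop configuration directory setting; please verify the key against the documentation for your Flink version. The helper name set_hadoop_conf_dir is hypothetical and only serves the illustration.

```shell
# Hypothetical helper: write the Hadoop configuration directory into a
# flink-conf.yaml file. The key name fs.hdfs.hadoopconf is an assumption
# based on the Flink 0.9 configuration docs.
set_hadoop_conf_dir() {
  conf_file="$1"
  hadoop_conf_dir="$2"
  # Drop any existing entry so the key appears only once.
  if [ -f "$conf_file" ]; then
    grep -v '^fs\.hdfs\.hadoopconf:' "$conf_file" > "$conf_file.tmp" || true
    mv "$conf_file.tmp" "$conf_file"
  fi
  # Append the new entry.
  echo "fs.hdfs.hadoopconf: $hadoop_conf_dir" >> "$conf_file"
}

# Example call, run from the Flink directory before yarn-session.sh:
#   set_hadoop_conf_dir conf/flink-conf.yaml /etc/hadoop/conf
```

Setting the environment variable is usually the lighter option, since it does not modify the Flink distribution; the config file entry is useful when the shell environment is not under your control.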

Flink needs to read the Hadoop configuration to know how to upload the Flink jar to the cluster file system such that the newly created YARN cluster can access it. If Flink fails to resolve the Hadoop configuration, it uses the local file system for uploading the jar. That means that the jar will be put on the machine you launch your cluster from. Thus, it won't be accessible from the Flink YARN cluster.
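To check which file system a given Hadoop configuration directory would resolve to, something along these lines may help. It is a rough sketch, not how Flink itself does the lookup: it assumes the directory contains a core-site.xml with an fs.defaultFS property (older setups may use fs.default.name instead), and the function name default_fs_scheme is made up for this example.

```shell
# Hypothetical helper: print the scheme of the default file system
# configured in <conf_dir>/core-site.xml, or "file" if nothing is found.
default_fs_scheme() {
  core_site="$1/core-site.xml"
  uri=""
  if [ -f "$core_site" ]; then
    # Join all lines, then pull the <value> that directly follows
    # <name>fs.defaultFS</name>.
    uri=$(tr -d '\n' < "$core_site" \
      | sed -n 's|.*<name>fs\.defaultFS</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p')
  fi
  # Without a configured value, Hadoop falls back to the local file system.
  [ -n "$uri" ] || uri="file:///"
  # Keep only the part before the first colon, e.g. "hdfs".
  echo "${uri%%:*}"
}

# Example call:
#   default_fs_scheme /etc/hadoop/conf
```

If this prints file, the YARN client would stage its files locally, which matches the FileNotFoundException in the question. On a machine with the Hadoop tools installed, hdfs getconf -confKey fs.defaultFS should give the same information without parsing XML by hand.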

Please see the Flink configuration page for more information.

Edit: On Amazon EMR, export HADOOP_CONF_DIR=/etc/hadoop/conf lets Flink discover the Hadoop configuration directory.
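Put together with the command from the question, the launch could look like the sketch below. The wrapper function start_yarn_session and its core-site.xml check are additions for illustration only; the directory /etc/hadoop/conf and the yarn-session.sh arguments are taken from this thread.

```shell
# Hypothetical wrapper: export HADOOP_CONF_DIR and start the Flink YARN
# session, but only if the directory looks like a Hadoop configuration.
start_yarn_session() {
  conf_dir="${1:-/etc/hadoop/conf}"
  # Assumption: a usable Hadoop config directory contains core-site.xml.
  if [ ! -f "$conf_dir/core-site.xml" ]; then
    echo "No core-site.xml in $conf_dir; not starting Flink" >&2
    return 1
  fi
  export HADOOP_CONF_DIR="$conf_dir"
  # Same invocation as in the question; run from the flink-0.9.0 directory.
  ./bin/yarn-session.sh -n 4 -jm 1024 -tm 4096
}

# Example call:
#   cd flink-0.9.0 && start_yarn_session
```

The early check is only a convenience: it turns a wrong path into an immediate error instead of a failed YARN application.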
