How to solve the "File does not exist" error when setting up an Oozie coordinator


Question

How can I resolve a "File does not exist" error when setting up an Oozie coordinator?

I have this error in the coordinator log:

Pig logfile dump:

Error: java.io.FileNotFoundException: File does not exist: /user/hdfs/jay/part-0.tmp

Coordinator setup:

<coordinator-app name="tes-ng" frequency="${coord:minutes(15)}"
start="2015-12-07T10:30+0700" end="2017-02-28T23:00+0700" timezone="Asia/Jakarta"
xmlns="uri:oozie:coordinator:0.1" xmlns:sla="uri:oozie:sla:0.1">
<controls>
    <execution>LAST_ONLY</execution>
</controls>
<datasets>
    <dataset name="INPUT_DS" frequency="${coord:minutes(15)}"
        initial-instance="2015-02-16T16:00+0700" timezone="Asia/Jakarta">
        <uri-template>${nameNode}/user/hdfs/jay/${YEAR}/${MONTH}/${DAY}/${HOUR}${MINUTE}
        </uri-template>
        <done-flag></done-flag>
    </dataset>
    <dataset name="OUTPUT_DS" frequency="${coord:minutes(15)}"
        initial-instance="2015-02-16T16:00+0700" timezone="Asia/Jakarta">
        <uri-template>${nameNode}/user/hdfs/jay/output</uri-template>
        <done-flag></done-flag>
    </dataset>
</datasets>
<input-events>
    <data-in name="INPUT" dataset="INPUT_DS">
        <instance>${coord:current(-2)}</instance>
    </data-in>
</input-events>
<output-events>
    <data-out name="OUTPUT" dataset="OUTPUT_DS">
        <instance>${coord:current(-2)}</instance>
    </data-out>
</output-events>
<action>
    <workflow>
        <app-path>${appFolder}</app-path>
        <configuration>
            <property>
                <name>INPUT</name>
                <value>${coord:dataIn('INPUT')}</value>
            </property>
            <property>
                <name>OUTPUT</name>
                <value>${coord:dataOut('OUTPUT')}</value>
            </property>
        </configuration>
    </workflow>
</action>
</coordinator-app>

What I want is for Oozie to wait until the file is ready instead of failing with the "File does not exist" error. Any ideas?

Thanks.

Recommended answer

The usual way to do this is to define a proper data dependency. The process that creates your input data writes a file signaling that the data is complete (e.g. _SUCCESS). If you define a done-flag in your input dataset (e.g. _SUCCESS), Oozie will periodically check for the existence of this file and start the workflow only once it is available. With the empty done-flag in your current configuration, Oozie only checks that the directory itself exists, so the workflow can start while files such as part-0.tmp are still being written.

<dataset name="INPUT_DS" frequency="${coord:minutes(15)}"
    initial-instance="2015-02-16T16:00+0700" timezone="Asia/Jakarta">
    <uri-template>${nameNode}/user/hdfs/jay/${YEAR}/${MONTH}/${DAY}/${HOUR}${MINUTE}
    </uri-template>
    <done-flag>_SUCCESS</done-flag>
</dataset>
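
While a coordinator action waits for its done-flag, how long it may wait is governed by the timeout element in the coordinator's controls section. The fragment below is a minimal sketch of that setting. The value of 120 minutes is an arbitrary illustration and does not come from the original post. In the coordinator specification, a value of -1 means no timeout.

```xml
<!-- Sketch only: the 120-minute value is an illustrative assumption -->
<controls>
    <!-- Maximum time, in minutes, that a materialized action waits for its
         input dependencies before it is discarded; -1 means no timeout -->
    <timeout>120</timeout>
    <execution>LAST_ONLY</execution>
</controls>
```

The server may also apply its own default or maximum timeout, so it is worth checking the Oozie site configuration before relying on a specific value.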

If you cannot have such a flag, then AFAIK the only option is to write your own input data check and plug it into Oozie (I've seen someone do that for Hive partitions).

You should also double-check the initial-instance value, as it seems you've put a UTC offset in there and then specified timezone=Asia/Jakarta on top of it.
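
To illustrate the point about initial-instance: by default Oozie processes coordinator datetimes in UTC and expects them in the form yyyy-MM-ddTHH:mmZ, unless the server's oozie.processing.timezone property has been changed. Assuming a default UTC setup (an assumption, so check your server configuration), 16:00 at +0700 would be written as 09:00Z, while the timezone attribute stays Asia/Jakarta:

```xml
<!-- Sketch assuming the Oozie server uses its default UTC processing timezone -->
<dataset name="INPUT_DS" frequency="${coord:minutes(15)}"
    initial-instance="2015-02-16T09:00Z" timezone="Asia/Jakarta">
    <uri-template>${nameNode}/user/hdfs/jay/${YEAR}/${MONTH}/${DAY}/${HOUR}${MINUTE}
    </uri-template>
    <done-flag>_SUCCESS</done-flag>
</dataset>
```

If the server is instead configured with a GMT+0700 processing timezone, the +0700 form used in the question would be the expected one. The same consideration applies to the start and end attributes of the coordinator-app element.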
