Oozie Sqoop action with Hive import


Question

I have a Sqoop action that pulls data from a Postgres database and then imports it into a Hive table. When I run the Oozie workflow, Sqoop pulls the data from Postgres into HDFS, but it fails to import the data into the Hive table. The logs are not useful: all the Oozie web console UI shows is Main class [org.apache.oozie.action.hadoop.SqoopMain], exit code [1]. Can we actually do a Hive import inside a Sqoop action, or do I have to run a separate Hive action after Sqoop has imported the data into HDFS?

<action name="ads-sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>dbIP</name>
                <value>${dbIP}</value>
            </property>
            <property>
                <name>dbPort</name>
                <value>${dbPort}</value>
            </property>
            <property>
                <name>dbUserName</name>
                <value>${dbUserName}</value>
            </property>
            <property>
                <name>dbPassword</name>
                <value>${dbPassword}</value>
            </property>
            <property>
                <name>hive_db_name</name>
                <value>${hive_db_name}</value>
            </property>
            <property>
                <name>scoop_target_dir</name>
                <value>${scoop_target_dir}</value>
            </property>
            <property>
                <name>dbName</name>
                <value>${dbName}</value>
            </property>
        </configuration>
        <command>import --connect jdbc:postgresql://${dbIP}:${dbPort}/${dbName} --username ${dbUserName} --password &quot;${dbPassword}&quot; --table ads --hive-table ${hive_db_name}.ads --create-hive-table --hive-import -m 1 --target-dir ${scoop_target_dir}/ads
        </command>
    </sqoop>
    <ok to="orders-sqoop-import"/>
    <error to="kill"/>
</action>


Recommended answer

I had to add the location of hive-site.xml to the Sqoop action to make the Hive import work. Oozie needs the Hive defaults, such as the metastore directory, in order to import data into Hive. Copy hive-site.xml to HDFS, then add the following element either under the global section or inside any action where you want to perform Hive functions:

<job-xml>hdfs://namenode/hive-site.xml</job-xml>
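
For context, here is a minimal sketch of where that element could sit inside the Sqoop action from the question. It assumes hive-site.xml has already been copied to the HDFS path shown, which is the placeholder path used above and not a real location; the properties and the command are taken from the question and shortened. In the sqoop-action schema, job-xml is placed after name-node and before configuration.

```xml
<action name="ads-sqoop-import">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- Placeholder path: point this at wherever hive-site.xml was copied on HDFS -->
        <job-xml>hdfs://namenode/hive-site.xml</job-xml>
        <configuration>
            <!-- Same properties as in the question; only one is repeated here -->
            <property>
                <name>hive_db_name</name>
                <value>${hive_db_name}</value>
            </property>
        </configuration>
        <command>import --connect jdbc:postgresql://${dbIP}:${dbPort}/${dbName} --username ${dbUserName} --password &quot;${dbPassword}&quot; --table ads --hive-table ${hive_db_name}.ads --create-hive-table --hive-import -m 1 --target-dir ${scoop_target_dir}/ads</command>
    </sqoop>
    <ok to="orders-sqoop-import"/>
    <error to="kill"/>
</action>
```

If several actions need Hive settings, putting the same job-xml element under the workflow's global section, as mentioned above, avoids repeating it in each action.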
