Sqoop Export Oozie Workflow Fails with File Not Found, Works When Run from the Console


Problem Description



I have a six-node Hadoop cluster. I'm pulling data out of MSSQL and back into MSSQL via Sqoop. Sqoop import commands work fine, and I can run a sqoop export command from the console (on one of the Hadoop nodes). Here's the shell script I run:

SQLHOST=sqlservermaster.local
SQLDBNAME=db1
HIVEDBNAME=db1
BATCHID=
USERNAME="sqlusername"
PASSWORD="password"


sqoop export --connect 'jdbc:sqlserver://'$SQLHOST';username='$USERNAME';password='$PASSWORD';database='$SQLDBNAME'' --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/$HIVEDBNAME.db/hivetablename    

When I run this command from an Oozie workflow, passing the same parameters, I receive the following error (dug out of the actual job run logs from the YARN scheduler screen):

2015-10-01 20:55:31,084 WARN [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job init failed
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1312)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1080)
    at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1519)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
    at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563)
    ... 17 more

Has anyone seen this and been able to troubleshoot it? It only happens from the Oozie workflow. There are similar topics out there, but no one seems to have solved this specific problem.

Thanks!
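For context, a command like this is typically wrapped in an Oozie Sqoop action along the following lines. This is a generic sketch, not the asker's actual workflow.xml; the action name and the ${jobTracker}/${nameNode} variables are assumed placeholders resolved from job.properties:

<action name="sqoop-export">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <!-- The same export the shell script runs, as a single Oozie command;
             values mirror the script above -->
        <command>export --connect jdbc:sqlserver://sqlservermaster.local;username=sqlusername;password=password;database=db1 --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/db1.db/hivetablename</command>
    </sqoop>
    <ok to="end"/>
    <error to="fail"/>
</action>

Note that Oozie splits the <command> string on whitespace, so this form only works because the JDBC URL contains no spaces; arguments containing spaces would need the <arg> form instead.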

Solution

I was able to solve this by setting the user.name property in the job.properties file for the Oozie workflow to the yarn user:

user.name=yarn
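
For context, that property sits alongside the usual workflow settings. A minimal job.properties sketch, where the ResourceManager address, queue, and application path are assumed placeholders (only the NameNode address is taken from the error log above):

# Sketch: all values except user.name are illustrative placeholders
nameNode=hdfs://hadoopnode1:8020
jobTracker=hadoopnode1:8050
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/yarn/workflows/sqoop-export
# The fix: run as yarn so staging files land under /user/yarn
user.name=yarn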

I think the problem was that the job did not have permission to create the staging files under /user/root. Once I changed the running user to yarn, the staging files were created under /user/yarn, which did have the proper permissions.
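
An alternative, if the workflow must keep running as root, would presumably be to give root a properly owned home directory in HDFS rather than switching users. A sketch using standard HDFS shell commands, run as the HDFS superuser:

# Sketch: create a staging home for root so /user/root/.staging is writable
sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -chown root:hdfs /user/root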
