Sqoop Export Oozie Workflow Fails with File Not Found, Works When Run from the Console
Problem Description
I have a hadoop cluster with 6 nodes. I'm pulling data out of MSSQL and back into MSSQL via Sqoop. Sqoop import commands work fine, and I can run a sqoop export command from the console (on one of the hadoop nodes). Here's the shell script I run:
SQLHOST=sqlservermaster.local
SQLDBNAME=db1
HIVEDBNAME=db1
BATCHID=
USERNAME="sqlusername"
PASSWORD="password"
sqoop export --connect 'jdbc:sqlserver://'$SQLHOST';username='$USERNAME';password='$PASSWORD';database='$SQLDBNAME'' --table ExportFromHive --columns col1,col2,col3 --export-dir /apps/hive/warehouse/$HIVEDBNAME.db/hivetablename
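The connect string above is built by alternating single-quoted literals with unquoted variable expansions, which is easy to misread. A minimal sketch of what the shell actually hands to sqoop after expansion (using the same sample values; sqoop itself is not invoked here):

```shell
# Same variables as the script above
SQLHOST=sqlservermaster.local
SQLDBNAME=db1
USERNAME="sqlusername"
PASSWORD="password"

# The quoted and unquoted segments concatenate into one JDBC URL;
# the trailing '' is an empty literal and adds nothing.
CONNECT='jdbc:sqlserver://'$SQLHOST';username='$USERNAME';password='$PASSWORD';database='$SQLDBNAME''
echo "$CONNECT"
```

If any of these values could ever contain spaces or shell metacharacters, it is safer to build the URL once and pass it quoted, e.g. `--connect "$CONNECT"`.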
When I run this command from an Oozie workflow, with the same parameters passed, I receive the following error (found by digging into the actual job run logs from the YARN scheduler screen):
2015-10-01 20:55:31,084 WARN [main] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Job init failed
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1568)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1432)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.transition(JobImpl.java:1390)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:996)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl.handle(JobImpl.java:138)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobEventDispatcher.handle(MRAppMaster.java:1312)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1080)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$4.run(MRAppMaster.java:1519)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1515)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1448)
Caused by: java.io.FileNotFoundException: File does not exist: hdfs://hadoopnode1:8020/user/root/.staging/job_1443713197941_0134/job.splitmetainfo
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
at org.apache.hadoop.mapreduce.split.SplitMetaInfoReader.readSplitMetaInfo(SplitMetaInfoReader.java:51)
at org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl$InitTransition.createSplits(JobImpl.java:1563)
... 17 more
Has anyone ever seen this and been able to troubleshoot it? It only happens from the Oozie workflow. There are similar topics, but no one seems to have solved this specific problem.
Thanks!
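For context, a script like the one above is typically invoked from Oozie as a shell action, roughly like this (a hypothetical workflow.xml sketch; the action name, script name, and property names are assumptions, not taken from the question):

```xml
<workflow-app name="sqoop-export-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="export-shell"/>
    <action name="export-shell">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- The shell script is shipped with the workflow and run on a cluster node -->
            <exec>sqoop_export.sh</exec>
            <file>${wfRoot}/sqoop_export.sh#sqoop_export.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Sqoop export failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Note that in this setup the sqoop job is submitted by whatever user the workflow runs as, which is why the staging directory under /user/&lt;username&gt; matters.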
Answer

I was able to solve this problem by setting the user.name property in the job.properties file for the Oozie workflow to the user yarn:
user.name=yarn
I think the problem was that the job did not have permission to create the staging files under /user/root. Once I changed the running user to yarn, the staging files were created under /user/yarn, which did have the proper permissions.
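If the jobs need to keep running as root, an alternative (not tested here) is to give root a proper HDFS home directory instead of switching users. As the HDFS superuser, something like:

```shell
# Hypothetical commands: create root's HDFS home and hand it over
# (the superuser account and group name may differ on your cluster)
sudo -u hdfs hdfs dfs -mkdir -p /user/root
sudo -u hdfs hdfs dfs -chown root:hdfs /user/root
```

Either way, the fix is about making sure the submitting user can write its .staging directory under /user/&lt;username&gt;.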