Sqoop export error - cause: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist


Problem description



I am developing a java program.

The java program exports data from hive to mysql.

First, I write the code

ProcessBuilder pb = new ProcessBuilder("sqoop-export", "export",
        "--connect", "jdbc:mysql://localhost/mydb",
        "--hadoop-home", "/home/yoonhok/development/hadoop-1.1.1",
        "--table", "mytable",
        "--export-dir", "/user/hive/warehouse/tbl_2",
        "--username", "yoonhok",
        "--password", "1234");

try {
    Process p = pb.start();
    if (p.waitFor() != 0) {
        System.out.println("Error: sqoop-export failed.");
        return false;
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
}

It works perfectly.

But I learned a new way of using sqoop in java.

Sqoop doesn't support a client API yet.

So I added the Sqoop lib and just called Sqoop.runTool().

Second, I wrote the code again using the new approach.

String[] str = {"export",
        "--connect", "jdbc:mysql://localhost/mydb",
        "--hadoop-home", "/home/yoonhok/development/hadoop-1.1.1",
        "--table", "mytable",
        "--export-dir", "/user/hive/warehouse/tbl_2",
        "--username", "yoonhok",
        "--password", "1234"
};

if (Sqoop.runTool(str) == 1) {
     System.out.println("Error: sqoop-export failed.");
     return false;
}

But it doesn't run.

I got an error...

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 17:43:12 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

I saw that $SQOOP_CONF_DIR has not been set in the environment.

So I added

SQOOP_CONF_DIR=/home/yoonhok/development/sqoop-1.4.2.bin__hadoop-1.0.0/conf

to

/etc/environment

and tried again, but still got an error...

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

I think that the export-dir is the problem.

I use "/user/hive/warehouse/tbl_2".

And when I run "hadoop fs -ls /user/hive/warehouse/", the table "tbl_2" exists.

I think that

"Input path does not exist: file:/user/hive/warehouse/tbl_2" is not ok.

"Input path does not exist: hdfs:/user/hive/warehouse/tbl_2" is ok.

But I don't know how I can fix it.


OK, just before this, I got a hint.

And I edited 'export-dir'

--export-dir   hdfs://localhost:9000/user/hive/warehouse/tbl_2

But... it's an error... T.T

13/02/15 15:17:20 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/02/15 15:17:20 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/02/15 15:17:20 INFO tool.CodeGenTool: Beginning code generation
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1/libexec/..
Note: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/02/15 15:17:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.jar
13/02/15 15:17:22 INFO mapreduce.ExportJobBase: Beginning export of tbl_2
13/02/15 15:17:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok922915382/.staging/job_local_0001
13/02/15 15:17:23 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.
13/02/15 15:17:23 ERROR tool.ExportTool: Encountered IOException running export job: java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.

When I checked HDFS,

hadoop fs -ls /user/hive/warehouse/tbl_2

or

hadoop fs -ls hdfs://localhost:9000/user/hive/warehouse/tbl_2

the file exists.

-rw-r--r-- 1 yoonhok supergroup 14029022 2013-02-15 12:16 /user/hive/warehouse/tbl_2/000000_0

I tried the shell command in a terminal:

sqoop-export --connect jdbc:mysql://localhost/detector --table tbl_2 --export-dir hdfs://localhost:9000/user/hive/warehouse/tbl_2 --username yoonhok --password 1234

It works.

What's the problem?

I don't know.

Could you help me?

Solution

You need to load and provide your Hadoop configuration files. By default they are read from the classpath, but you might be able to override this via Configuration.setDefaultResource (without guarantees).
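A minimal sketch of that idea, assuming the Sqoop 1.4.x Sqoop class provides the runTool(String[], Configuration) overload and that the Hadoop conf files live under the --hadoop-home directory given in the question (the class name and conf file paths below are illustrative only): load core-site.xml (and hdfs-site.xml) into a Configuration so that fs.default.name points at hdfs://localhost:9000 instead of the local file system, then hand that Configuration to Sqoop.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.sqoop.Sqoop;

public class HiveToMysqlExport {

    public static boolean exportTable() {
        // Load the cluster config explicitly so that fs.default.name points at
        // hdfs://localhost:9000 rather than the default local file system (file:///).
        // The conf paths are a guess based on the --hadoop-home from the question.
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/yoonhok/development/hadoop-1.1.1/conf/core-site.xml"));
        conf.addResource(new Path("/home/yoonhok/development/hadoop-1.1.1/conf/hdfs-site.xml"));

        String[] args = {"export",
                "--connect", "jdbc:mysql://localhost/mydb",
                "--table", "mytable",
                "--export-dir", "/user/hive/warehouse/tbl_2",
                "--username", "yoonhok",
                "--password", "1234"
        };

        // Pass the Configuration in so Sqoop does not fall back to whatever
        // (possibly empty) configuration it happens to find on the classpath.
        if (Sqoop.runTool(args, conf) != 0) {
            System.out.println("Error: sqoop-export failed.");
            return false;
        }
        return true;
    }
}
```

This would also explain the symptom: without core-site.xml on the classpath, Hadoop falls back to file:/// as the default file system, which is why the error mentions file:/user/hive/warehouse/tbl_2, while the sqoop-export shell script works presumably because it picks up the Hadoop configuration directory itself before launching the job.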
