Is it possible to run Hadoop jobs (like the WordCount sample) in the local mode on Windows without Cygwin?
Problem description
I have Windows 7, Java 8, Maven and Eclipse. I've created a Maven project and used almost exactly the same code as here.
It's just a simple "word count" sample. When I launch the "driver" program from Eclipse with command-line arguments (the input file and the output directory), I get the following error:
Exception in thread "main" java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:435)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:277)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:344)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
    at misc.projects.hadoop.exercises.WordCountDriverApp.main(WordCountDriverApp.java:29)
The failing line (WordCountDriverApp.java:29) contains the command to launch the job:
job.waitForCompletion(true)
I want to make this work, so I would like to understand two things:
Do I have to provide hdfs-site.xml, yarn-site.xml, ... and the like if I want just the local mode (without any cluster)? I don't have these XML config files now. As far as I remember, the defaults are all fine for the local mode, but maybe I am wrong.
Is it possible at all to launch Hadoop jobs under Windows, or is the whole Hadoop thing Linux-only?
P.S.: The Hadoop dependency is the following:
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
    <scope>provided</scope>
</dependency>
- Download a build of Hadoop 2.6.0 or 2.7.1 compiled for Windows
- Create a HADOOP_HOME environment variable pointing to the unzipped directory
- Add %HADOOP_HOME%\bin to the PATH environment variable
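A NullPointerException thrown from ProcessBuilder.start under org.apache.hadoop.util.Shell is commonly reported when Hadoop on Windows cannot locate its native helper, winutils.exe, which Windows builds normally ship in the bin directory. Before launching the job from Eclipse, it may help to verify that the steps above took effect. The following is only a minimal sketch: the class HadoopHomeCheck and its methods are hypothetical names, it uses plain JDK calls only, and it assumes that Hadoop consults the hadoop.home.dir system property before the HADOOP_HOME environment variable.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of Hadoop): checks whether a Hadoop home
// directory contains bin/winutils.exe.
class HadoopHomeCheck {

    // Builds the path where a Windows build of Hadoop is expected to keep winutils.exe.
    static Path winutilsPath(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    // Returns true only if hadoopHome is non-empty and bin/winutils.exe exists under it.
    static boolean isUsable(String hadoopHome) {
        return hadoopHome != null
                && !hadoopHome.isEmpty()
                && Files.isRegularFile(winutilsPath(hadoopHome));
    }

    public static void main(String[] args) {
        // Assumption: the system property takes precedence over the environment variable.
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        if (isUsable(home)) {
            System.out.println("Found " + winutilsPath(home));
        } else {
            System.err.println("winutils.exe not found; check HADOOP_HOME (currently: " + home + ")");
        }
    }
}
```

Note that Eclipse only sees environment variables that existed when it was started, so it usually has to be restarted after HADOOP_HOME and PATH are changed.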
Source: https://stackoverflow.com/a/27394808/543836