Is it possible to run Hadoop jobs (like the WordCount sample) in the local mode on Windows without Cygwin?

Question

I have Windows 7, Java 8, Maven and Eclipse. I've created a Maven project and used almost exactly the same code as here.

It's just a simple "word count" sample. I launch the "driver" program from Eclipse with command-line arguments (the input file and the output directory) and get the following error:

Exception in thread "main" java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:404)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:678)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:661)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:639)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:435)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:277)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:344)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286)
    at misc.projects.hadoop.exercises.WordCountDriverApp.main(WordCountDriverApp.java:29)

The failing line (WordCountDriverApp.java:29) contains the command to launch the job:

job.waitForCompletion(true)

I want to make it work, so I want to understand a few things:

Do I have to provide hdfs-site.xml, yarn-site.xml and so on if I want just the local mode (without any cluster)? I don't have these XML config files now. As far as I remember, the defaults are all fine for the local mode, but maybe I am wrong.

Is it possible at all to launch Hadoop jobs under Windows, or is the whole Hadoop thing Linux-only?

P.S.: The Hadoop dependency is the following:

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.2.0</version>
    <scope>provided</scope>
</dependency>

Recommended answer

  1. Download Hadoop 2.6.0 or 2.7.1 compiled for Windows
  2. Create a HADOOP_HOME environment variable pointing to the unzipped directory
  3. Add %HADOOP_HOME%\bin to the PATH environment variable

Source: https://stackoverflow.com/a/27394808/543836
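
The three steps above can be sanity-checked before submitting the job. The sketch below is not from the original answer. It assumes that Hadoop on Windows needs a winutils.exe in the bin directory of the Hadoop home, and that Hadoop reads the home from the hadoop.home.dir system property first and the HADOOP_HOME environment variable second. A missing winutils.exe is a likely cause of the NullPointerException in the question's stack trace, though that is an inference rather than something the answer states. HadoopHomeCheck and its methods are hypothetical helper names, not part of Hadoop.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical pre-flight check; not part of Hadoop. */
class HadoopHomeCheck {

    /** Returns the path to bin/winutils.exe under the given Hadoop home, if that file exists. */
    static Optional<Path> findWinutils(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.trim().isEmpty()) {
            return Optional.empty();
        }
        Path candidate = Paths.get(hadoopHome.trim(), "bin", "winutils.exe");
        return Files.isRegularFile(candidate) ? Optional.of(candidate) : Optional.empty();
    }

    /** Assumed lookup order: system property first, then environment variable. */
    static String resolveHadoopHome() {
        String home = System.getProperty("hadoop.home.dir");
        return home != null ? home : System.getenv("HADOOP_HOME");
    }

    public static void main(String[] args) {
        String home = resolveHadoopHome();
        Optional<Path> winutils = findWinutils(home);
        if (winutils.isPresent()) {
            System.out.println("Found " + winutils.get());
        } else {
            System.out.println("winutils.exe not found under Hadoop home: " + home);
        }
    }
}
```

If changing environment variables is inconvenient (for example when launching from Eclipse), setting the hadoop.home.dir system property at the very start of main, before any Hadoop class is used, may work as an alternative; treat that as an assumption to verify against your Hadoop version.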
