Apache Nutch error: Injector: java.io.IOException: (null) entry in command string: null chmod 0644


Problem description

I am using Apache Nutch 1.14 on Windows 10 with Java 1.8. I followed the steps described at https://wiki.apache.org/nutch/NutchTutorial.

When I try to inject the URLs into the crawldb by running this command in Cygwin: bin/nutch inject crawl/crawldb urls

I get the following error: Injector: java.io.IOException: (null) entry in command string: null chmod 0644 E:\apache-nutch-1.4\runtime\local\crawl\crawldb.locked at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:773)

I checked the logs and found this:

2018-01-18 10:55:26,785 ERROR util.Shell - Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

I searched for this error on several pages, but none of them helped.

Recommended answer

  1. Create a new directory in Windows, e.g. c:\winutil.
  2. Inside winutil, create a bin directory.
  3. Open https://minhaskamal.github.io/DownGit/#/home
  4. Paste https://github.com/steveloughran/winutils/tree/master/hadoop-2.8.1 into that site and download the winutil-hadoop2.8.1 archive.
  5. Extract the contents of the zip into c:\winutil\bin, so that winutils.exe ends up at c:\winutil\bin\winutils.exe.
  6. Add a HADOOP_HOME variable to your system environment variables and point it to c:\winutil.
  7. Re-run your crawl command in Cygwin.
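
The log line "null\bin\winutils.exe" suggests that Hadoop had no home directory to prepend, so after following the steps it can help to confirm that the variable is set and the binary is where Hadoop will look for it. The following is a minimal sketch for a Cygwin bash session, not part of Nutch or Hadoop; the function name check_winutils is hypothetical, and the c:\winutil path is only the example location used in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory looks usable as HADOOP_HOME on Windows.
# check_winutils is a hypothetical helper, not a Nutch or Hadoop command.
check_winutils() {
  local home="$1"
  # An empty value corresponds to the "null\bin\winutils.exe" message in the log.
  if [ -z "$home" ]; then
    echo "HADOOP_HOME is not set"
    return 1
  fi
  # Hadoop expects the binary at <HADOOP_HOME>\bin\winutils.exe.
  if [ ! -f "$home/bin/winutils.exe" ]; then
    echo "winutils.exe not found under $home/bin"
    return 1
  fi
  echo "ok: $home/bin/winutils.exe"
  return 0
}

# Possible usage in Cygwin (assumes the c:\winutil layout from the steps above):
#   export HADOOP_HOME='C:\winutil'      # this session only
#   check_winutils "$(cygpath -u "$HADOOP_HOME")" && bin/nutch inject crawl/crawldb urls
```

If the variable was added through the Windows system settings, a Cygwin terminal that was already open will likely not see it until it is restarted.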

