Nutch in Windows: Failed to set permissions of path


Problem description

I'm trying to use Solr with Nutch on a Windows machine, and I'm getting the following error:

Exception in thread "main" java.io.IOException: Failed to set permissions of path: c:\temp\mapred\staging\admin-1654213299\.staging to 0700

From a lot of threads I learned that Hadoop, which Nutch appears to use, does some chmod magic that works on Unix machines but not on Windows.
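
For context, that "chmod magic" boils down to something like the sketch below. This is an illustrative reconstruction, not the actual Hadoop source: the job client tries to lock the local staging directory down to 0700 and treats any failure as fatal.

    // Illustrative sketch only -- not the actual Hadoop source. It mimics the
    // shape of the failing code path: apply Unix-style 0700 permissions to the
    // local job staging directory and abort if that cannot be done.
    import java.io.File;
    import java.io.IOException;

    public class StagingPermissionSketch {

        // Hypothetical stand-in for FileUtil.setPermission()/checkReturnValue().
        static void chmod700OrDie(File dir) throws IOException, InterruptedException {
            Process p = new ProcessBuilder("chmod", "700", dir.getAbsolutePath()).start();
            if (p.waitFor() != 0) {
                // Same message pattern as in the stack trace below.
                throw new IOException("Failed to set permissions of path: " + dir + " to 0700");
            }
        }

        public static void main(String[] args) throws Exception {
            File staging = new File(System.getProperty("java.io.tmpdir"), "staging-demo");
            staging.mkdirs();
            // Works on Unix; on a plain Windows setup "chmod" is missing or cannot
            // express 0700, so job submission stops right here.
            chmod700OrDie(staging);
            System.out.println("0700 applied to " + staging);
        }
    }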

This problem has existed for more than a year now. I found one thread where the offending code line is shown and a fix is proposed. Am I really the only one who has this problem? Is everyone else creating a custom build in order to run Nutch on Windows? Or is there some option to disable the Hadoop permission handling, or another solution? Maybe another crawler than Nutch?

Thanks a lot. Boris

Here's the stack trace of what I'm doing....

    admin@WIN-G1BPD00JH42 /cygdrive/c/solr/apache-nutch-1.6
    $ bin/nutch crawl urls -dir crawl -depth 3 -topN 5 -solr http://localhost:8080/solr-4.1.0
    cygpath: can't convert empty path
    crawl started in: crawl
    rootUrlDir = urls
    threads = 10
    depth = 3
    solrUrl=http://localhost:8080/solr-4.1.0
    topN = 5
    Injector: starting at 2013-03-03 17:43:15
    Injector: crawlDb: crawl/crawldb
    Injector: urlDir: urls
    Injector: Converting injected urls to crawl db entries.
    Exception in thread "main" java.io.IOException: Failed to set permissions of path:         c:\temp\mapred\staging\admin-1654213299\.staging to 0700
        at org.apache.hadoop.fs.FileUtil.checkReturnValue(FileUtil.java:689)
        at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:662)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
        at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
        at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:189)
        at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:116)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Unknown Source)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
        at org.apache.nutch.crawl.Injector.inject(Injector.java:281)
        at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

Solution

It took me a while to get this working, but here's the solution, which works on Nutch 1.7.

  1. Download Hadoop Core 0.20.2 from the MVN repository.
  2. Replace (nutch-directory)/lib/hadoop-core-1.2.0.jar with the downloaded file, renaming the downloaded file to the same name.

That should be it.
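
If you want to confirm which hadoop-core actually ends up on the classpath after the swap, one quick way (class name and paths here are just examples, not from the original answer) is to print Hadoop's own version info:

    // Prints the version of the Hadoop jar that is actually on the classpath.
    // Example compile/run (paths are illustrative; on Windows cmd use ';'
    // instead of ':' as the classpath separator):
    //   javac -cp lib/hadoop-core-1.2.0.jar PrintHadoopVersion.java
    //   java  -cp .:lib/hadoop-core-1.2.0.jar PrintHadoopVersion
    // After the replacement above it should report 0.20.2.
    import org.apache.hadoop.util.VersionInfo;

    public class PrintHadoopVersion {
        public static void main(String[] args) {
            System.out.println("Hadoop version: " + VersionInfo.getVersion());
            System.out.println("Built by " + VersionInfo.getUser() + " on " + VersionInfo.getDate());
        }
    }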

Explanation

This issue is caused by Hadoop, since it assumes you're running on Unix and abides by the file permission rules. The issue was actually resolved in 2011, but Nutch didn't update the Hadoop version it uses. The relevant fixes are here and here.
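
For completeness, the "custom build" route mentioned in the question usually comes down to patching the offending check in org.apache.hadoop.fs.FileUtil so it warns instead of throwing, then rebuilding hadoop-core. The sketch below assumes the Hadoop 1.x method signature and the class's existing LOG field (neither is shown in this post), so treat it as a guide rather than a drop-in diff:

    // Sketch of the commonly circulated Windows workaround inside
    // org.apache.hadoop.fs.FileUtil (Hadoop 1.x): replace the body of
    // checkReturnValue so a failed permission call is logged, not fatal.
    // Imports, the LOG field and the rest of the class already exist in FileUtil.
    private static void checkReturnValue(boolean rv, File p, FsPermission permission)
            throws IOException {
        if (!rv) {
            // Original behaviour: throw new IOException("Failed to set permissions
            // of path: " + p + " to " + String.format("%04o", permission.toShort()));
            // Patched behaviour: Windows cannot apply POSIX permissions, so warn and continue.
            LOG.warn("Failed to set permissions of path: " + p + " to "
                    + String.format("%04o", permission.toShort()) + " (ignored)");
        }
    }

Swapping in the prebuilt 0.20.2 jar as described above avoids having to maintain such a patched build.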
