Hadoop: FileNotFoundException when getting file from DistributedCache

Question

I have a 2-node cluster (v1.04), a master and a slave. On the master, in Tool.run(), we add two files to the DistributedCache using addCacheFile(). The files do exist in HDFS. In Mapper.setup() we want to retrieve those files from the cache using

FSDataInputStream fs = FileSystem.get( context.getConfiguration() ).open( path ). 

The problem is that a FileNotFoundException is thrown for one of the files, although the file exists on the slave node:

attempt_201211211227_0020_m_000000_2: java.io.FileNotFoundException: File does not exist: /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv

ls -l on the slave:

[hduser@slave ~]$ ll /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv
-rwxr-xr-x 1 hduser hadoop 42701 Nov 22 10:18 /somedir/hdp.tmp.dir/mapred/local/taskTracker/distcache/-7769715304990780/master/tmp/analytics/1.csv

My questions are:

  1. Aren't all files supposed to exist on all nodes?
  2. How can I fix this?

Thanks.

Recommended answer

Solved. I should have used:

FileSystem.getLocal( conf ) 

Thanks to Harsh J from the Hadoop mailing list.
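For context, a likely reason the fix works: FileSystem.get(conf) returns the job's default filesystem, which on a cluster is normally HDFS, while the path in the exception is a directory on the TaskTracker's local disk. Opening that path through the default filesystem therefore looks in HDFS, where it does not exist. The sketch below shows how Mapper.setup() might read the localized copies through FileSystem.getLocal(conf) instead. It assumes the Hadoop 1.x new-API Mapper and DistributedCache.getLocalCacheFiles(); the class name CacheAwareMapper, the helper readLocalFile, the key/value types, and the UTF-8 line-oriented file format are all illustrative assumptions and do not come from the original post.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper; the name and key/value types are assumptions for illustration.
public class CacheAwareMapper extends Mapper<LongWritable, Text, Text, Text> {

    // Lines loaded from the localized cache files (assumed to be text, e.g. CSV).
    private final List<String> cachedLines = new ArrayList<String>();

    // Reads a file from the node's local disk, not from the default (HDFS) filesystem.
    static List<String> readLocalFile(Configuration conf, Path localPath) throws IOException {
        FileSystem localFs = FileSystem.getLocal(conf);
        FSDataInputStream in = localFs.open(localPath);
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, "UTF-8"));
        List<String> lines = new ArrayList<String>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close(); // also closes the underlying stream
        }
        return lines;
    }

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        // Local (per-node) paths of the files registered with addCacheFile().
        Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
        if (localFiles == null) {
            return; // nothing was localized for this task
        }
        for (Path localPath : localFiles) {
            cachedLines.addAll(readLocalFile(conf, localPath));
        }
    }
}
```

The paths returned by getLocalCacheFiles() point at the localized copies under the TaskTracker's local directory, which is why they have to be opened with the local filesystem rather than with FileSystem.get(conf).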
