Distributed Caching in Hadoop: File Not Found Exception


Problem description



The logs show that the cached files were created. But when I go and look at that location the file is not present, and when I try to read it from my mapper I get a File Not Found Exception.

This is the code that I am trying to run:

    JobConf conf2 = new JobConf(getConf(), CorpusCalculator.class);
    conf2.setJobName("CorpusCalculator2");

    //Distributed Caching of the file emitted by the reducer2 is done here
    conf2.addResource(new Path("/opt/hadoop1/conf/core-site.xml"));
    conf2.addResource(new Path("/opt/hadoop1/conf/hdfs-site.xml"));

    //cacheFile(conf2, new Path(outputPathofReducer2));

    conf2.setNumReduceTasks(1);
    //conf2.setOutputKeyComparatorClass()

    conf2.setMapOutputKeyClass(FloatWritable.class);
    conf2.setMapOutputValueClass(Text.class);


    conf2.setOutputKeyClass(Text.class);
    conf2.setOutputValueClass(Text.class);

    conf2.setMapperClass(MapClass2.class);
    conf2.setReducerClass(Reduce2.class);



    FileInputFormat.setInputPaths(conf2, new Path(inputPathForMapper1));
    FileOutputFormat.setOutputPath(conf2, new Path(outputPathofReducer3));

    DistributedCache.addCacheFile(new Path("/sunilFiles/M51.txt").toUri(), conf2);
    JobClient.runJob(conf2);
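For reference, the mapper side usually locates the task-local copy of a cached file in `configure()` via `DistributedCache.getLocalCacheFiles()`. The question does not show the internals of `MapClass2`, so the sketch below is an assumption about how that read might look with the old `mapred` API (Hadoop 1.x):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobConf;

// Sketch only: a configure() method that opens the cached file.
// Field and method names here are illustrative, not from the question.
public void configure(JobConf job) {
    try {
        // Returns the task-local paths of everything added with addCacheFile()
        Path[] cached = DistributedCache.getLocalCacheFiles(job);
        if (cached != null) {
            for (Path p : cached) {
                BufferedReader reader =
                        new BufferedReader(new FileReader(p.toString()));
                // ... consume the contents of M51.txt here ...
                reader.close();
            }
        }
    } catch (IOException e) {
        System.err.println("Exception reading DistributedCache: " + e);
    }
}
```

Note that if the cached URI points at a directory rather than a single file, the `FileReader` above throws exactly the `FileNotFoundException ... (Is a directory)` seen in the logs.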

Logs:

13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Creating M51.txt in /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles-work-2204204368663038938 with rwxr-xr-x

13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Cached /sunilFiles/M51.txt as /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt

13/04/27 04:43:40 INFO filecache.TrackerDistributedCacheManager: Cached /sunilFiles/M51.txt as /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt

13/04/27 04:43:40 INFO mapred.JobClient: Running job: job_local_0003

13/04/27 04:43:40 INFO mapred.Task:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@8c2df1

13/04/27 04:43:40 INFO mapred.MapTask: numReduceTasks: 1

13/04/27 04:43:40 INFO mapred.MapTask: io.sort.mb = 100

13/04/27 04:43:40 INFO mapred.MapTask: data buffer = 79691776/99614720

13/04/27 04:43:40 INFO mapred.MapTask: record buffer = 262144/327680

inside configure():

Exception reading DistribtuedCache: java.io.FileNotFoundException: /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt (Is a directory)

Inside setup(): /tmp1/mapred/local/archive/-1731849462204707023_-2090562221_1263420527/localhost/sunilFiles/M51.txt

13/04/27 04:43:41 WARN mapred.LocalJobRunner: job_local_0003

Please help me out; I have been searching for a solution for the last 6 hours, and tomorrow I have an assignment submission. Thank you very much.

Solution

I solved this problem by using copyMerge(), which merges all the files present on the various machines into a single file, and I was able to use that merged file successfully. When I used the file directly, it failed. Thanks for your replies, guys.
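The answer does not show the actual call, but `copyMerge()` here presumably refers to `FileUtil.copyMerge` from the Hadoop 1.x API, which concatenates every file under a source directory into one destination file. A minimal sketch under that assumption (the paths are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

// Sketch: collapse a reducer's part-files into a single HDFS file
// before handing it to the DistributedCache, so the cached URI
// resolves to a file rather than a directory.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

Path srcDir = new Path("/sunilFiles");          // directory of part files (assumed)
Path merged = new Path("/sunilFiles/M51.txt");  // single merged output (assumed)

// copyMerge(srcFS, srcDir, dstFS, dstFile, deleteSource, conf, addString)
FileUtil.copyMerge(fs, srcDir, fs, merged, false, conf, null);

// Now the cache entry points at a real file, not a directory:
DistributedCache.addCacheFile(merged.toUri(), conf2);
```

This matches the symptom in the logs: the original cache URI resolved to a directory on the task node, which is why opening it threw `FileNotFoundException ... (Is a directory)`.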
