Hadoop DistributedCache is deprecated - what is the preferred API?

Question

My map tasks need some configuration data, which I would like to distribute via the Distributed Cache.

The Hadoop MapReduce Tutorial shows the usage of the DistributedCache class, roughly as follows:

// In the driver
JobConf conf = new JobConf(getConf(), WordCount.class);
...
DistributedCache.addCacheFile(new Path(filename).toUri(), conf); 

// In the mapper
Path[] myCacheFiles = DistributedCache.getLocalCacheFiles(job);
...

However, the DistributedCache class is marked as deprecated in Hadoop 2.2.0.

What is the new preferred way to achieve this? Is there an up-to-date example or tutorial covering this API?

Answer

The APIs for the Distributed Cache can be found in the Job class itself. Check the documentation here: http://hadoop.apache.org/docs/stable2/api/org/apache/hadoop/mapreduce/Job.html. The code should be something like:

Job job = Job.getInstance(); // the Job() constructor is also deprecated; use the getInstance() factory
...
job.addCacheFile(new Path(filename).toUri());

And in your mapper code:

Path[] localPaths = context.getLocalCacheFiles();
...
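
To tie the two snippets together, here is a minimal end-to-end sketch of the Job-based API. The class names, the cache file path /user/hadoop/cachedConfig.txt, and the line-reading logic in setup() are illustrative assumptions, not part of the original answer:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class DistributedCacheExample {

    public static class ConfigAwareMapper
            extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void setup(Context context)
                throws IOException, InterruptedException {
            // Files registered with job.addCacheFile() are copied to each node;
            // getLocalCacheFiles() returns their local filesystem paths.
            Path[] localPaths = context.getLocalCacheFiles();
            if (localPaths != null && localPaths.length > 0) {
                try (BufferedReader reader =
                        new BufferedReader(new FileReader(localPaths[0].toString()))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        // parse the configuration data needed by map()
                    }
                }
            }
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // map logic that uses the configuration loaded in setup()
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance();
        job.setJarByClass(DistributedCacheExample.class);
        job.setMapperClass(ConfigAwareMapper.class);
        // Register the side file with the job; it will be localized on every task node.
        job.addCacheFile(new Path("/user/hadoop/cachedConfig.txt").toUri());
        // ... set input/output formats and paths, then submit with job.waitForCompletion(true)
    }
}

Reading the cached file once in setup() rather than in map() avoids re-opening it for every input record.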
