Create a MapReduce job with an image as an input


Problem description


I'm a new user of Hadoop and MapReduce, and I would like to create a MapReduce job that takes some measurements on images. That is why I would like to know: can I pass an image as input to MapReduce? If yes, is there any kind of example?

Thanks.

Solution

Yes, you can totally do this.

With the limited information provided, I can only give you a very general answer.

Either way: 1) You will need to write a custom InputFormat that, instead of handing out chunks of files in HDFS locations (as TextInputFormat and SequenceFileInputFormat do), passes each map task the HDFS path name of an image. Reading the image from that path won't be too hard.
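As a rough illustration of the "reading the image" part, here is a minimal sketch of what a map task might do once it has a stream for the image. It only uses the JDK so it can run without a cluster. The class name `ImageMeasure` and the mean-brightness measure are made up for this example; in a real mapper the `InputStream` would presumably come from opening the HDFS path (for instance via Hadoop's `FileSystem.open`), and the custom InputFormat itself is not shown here.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of Hadoop.
final class ImageMeasure {
    private ImageMeasure() {}

    /**
     * Decodes an image from the stream and returns its mean brightness
     * (average of the R, G and B channels per pixel) in the range 0..255.
     */
    static double meanBrightness(InputStream in) throws IOException {
        // ImageIO.read returns null when no registered reader understands the data.
        BufferedImage image = ImageIO.read(in);
        if (image == null) {
            throw new IOException("Unsupported or corrupt image data");
        }
        int width = image.getWidth();
        int height = image.getHeight();
        long sum = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = image.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                sum += (r + g + b) / 3;
            }
        }
        return (double) sum / ((long) width * height);
    }
}
```

A mapper could call something like `ImageMeasure.meanBrightness(stream)` and emit the path as the key and the measurement as the value, so the image itself never has to travel through the shuffle.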

If you plan to have a Reduce phase in which images are passed around through the framework: 2) You will need to make an "ImageWritable" class that implements Writable (or WritableComparable if you're keying on the image). In your write() method, you'll need to serialize your image to a byte array. What I would do is first write an int/long to the output giving the size of the array you're about to write, and then write the array itself as bytes.

In your readFields() method (the read counterpart to write() in the Writable interface), you'll read the int/long first (it gives the size of the image payload), create a byte array of this size, and then read the bytes fully into that array, up to the length you just read.
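The length-prefixed scheme described above could look roughly like the sketch below. To keep it runnable without Hadoop on the classpath, the class does not declare `implements Writable`; in a real job you would add `implements org.apache.hadoop.io.Writable`, whose two methods have the same signatures as `write(DataOutput)` and `readFields(DataInput)` here. Storing the image as already-encoded bytes (for example PNG) is an assumption of this sketch.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Sketch only: in a Hadoop job this class would declare
// "implements org.apache.hadoop.io.Writable".
class ImageWritable {
    // Assumption: the image is held as encoded bytes (e.g. PNG or JPEG).
    private byte[] bytes = new byte[0];

    // Hadoop instantiates Writables reflectively, so keep a no-arg constructor.
    ImageWritable() {}

    ImageWritable(byte[] bytes) {
        this.bytes = bytes.clone();
    }

    byte[] getBytes() {
        return bytes.clone();
    }

    // Write the payload size first, then the payload itself.
    public void write(DataOutput out) throws IOException {
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Read the size, allocate an array of that size, then fill it completely.
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative image length: " + length);
        }
        bytes = new byte[length];
        in.readFully(bytes);
    }
}
```

An int prefix is enough here because a Java array cannot hold more than `Integer.MAX_VALUE` elements anyway.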

I'm not entirely sure what you're doing, but that's how I'd go about it.
