Encoding image into Jpeg2000 using Distributed Computing like Hadoop


Problem Description



Just wondering if anybody has done, or is aware of, encoding/compressing a large image into JPEG2000 format using Hadoop? There is also this http://code.google.com/p/matsu-project/ which uses MapReduce to process the image.

The image size is about 1 TB+, and on a single machine it takes 100+ hours.

Solution

How large of an image are you talking about? From the JPEG 2000 Wikipedia page it seems that the tiling and wavelet transformations should be easily parallelizable -- the tiles appear to be independent of each other. There is an open source library called JasPer that appears to be fairly widely used, but it is written in C, which will make it a bit tricky to integrate into Hadoop.
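One way around the C-integration problem, purely as a sketch, is to skip JNI and shell out to a native command-line encoder from each map task. The snippet below assumes a hypothetical `jp2_encode_tile` tool that reads a raw tile on stdin and writes a JPEG 2000 codestream to stdout; JasPer and OpenJPEG both ship command-line encoders, but the actual tool name and flags are placeholders here and would need to be checked against whichever library you install.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/**
 * Minimal sketch: pipe one raw tile through an external JPEG 2000 encoder.
 * "jp2_encode_tile" and its flags are hypothetical placeholders for whatever
 * native encoder (JasPer, OpenJPEG, ...) is actually installed on the nodes.
 */
public class NativeTileEncoder {

    public static byte[] encodeTile(byte[] rawTile) throws IOException, InterruptedException {
        // Assumed behaviour: raw pixels in on stdin, JPEG 2000 codestream out on stdout.
        Process p = new ProcessBuilder("jp2_encode_tile", "--stdin", "--stdout")
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();

        // Feed the raw tile to the encoder. For tiles much larger than the OS
        // pipe buffer you would want to write stdin from a separate thread to
        // avoid a write/read deadlock; kept single-threaded here for brevity.
        try (OutputStream stdin = p.getOutputStream()) {
            stdin.write(rawTile);
        }

        // Collect the compressed codestream from stdout.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream stdout = p.getInputStream()) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = stdout.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }

        int exit = p.waitFor();
        if (exit != 0) {
            throw new IOException("tile encoder exited with status " + exit);
        }
        return out.toByteArray();
    }
}
```

The trade-off is one process launch per tile, which should be negligible next to the cost of the wavelet transform itself.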

You will essentially have to pull the codec apart, call the appropriate tiling and encoding functions in the map step, and then reassemble and write out the image in the reduce step. It will probably require a fairly deep understanding of the JPEG 2000 format itself.
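A minimal sketch of that map/reduce split, assuming an upstream InputFormat that already delivers (tile index, raw tile bytes) pairs -- the InputFormat and the real JP2 container writing are the genuinely hard parts and are only stubbed here:

```java
import java.io.IOException;

import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/**
 * Sketch of the split described above: map compresses tiles independently,
 * reduce stitches them back together. NativeTileEncoder is the hypothetical
 * wrapper from the previous sketch, not a real library class.
 */
public class Jpeg2000TileJob {

    /** Map step: compress each raw tile on its own. */
    public static class TileEncodeMapper
            extends Mapper<LongWritable, BytesWritable, LongWritable, BytesWritable> {

        @Override
        protected void map(LongWritable tileIndex, BytesWritable rawTile, Context context)
                throws IOException, InterruptedException {
            byte[] encoded = NativeTileEncoder.encodeTile(rawTile.copyBytes());
            context.write(tileIndex, new BytesWritable(encoded));
        }
    }

    /** Reduce step: tiles arrive sorted by index; append them to the output. */
    public static class TileAssembleReducer
            extends Reducer<LongWritable, BytesWritable, LongWritable, BytesWritable> {

        @Override
        protected void reduce(LongWritable tileIndex, Iterable<BytesWritable> tiles, Context context)
                throws IOException, InterruptedException {
            for (BytesWritable tile : tiles) {
                // A real job would write proper JP2 codestream structure
                // (headers, markers, tile-part offsets) here instead of
                // simply re-emitting the compressed bytes.
                context.write(tileIndex, tile);
            }
        }
    }
}
```

Running this with a single reducer (job.setNumReduceTasks(1)) means the shuffle hands the reducer the encoded tiles already sorted by index, which is what makes the reassembly straightforward; the price is that the whole compressed image funnels through one node at the end.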

The question is: how much time will you spend moving the uncompressed data around and then reassembling it, compared to processing the tiles serially on a single machine? You might want to do some back-of-the-envelope calculations to see whether it is worth it and what the theoretical speedup would be compared to doing it on a single machine.
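Purely as an illustration with made-up numbers: pushing 1 TB of raw pixels over a 1 Gbit/s link is roughly 10^12 bytes / 125 MB/s, about 8,000 seconds or a bit over two hours, which is small next to the quoted 100+ hours of encoding, so data movement alone probably does not sink the idea. A tiny calculator like the one below makes it easy to plug in your own cluster's numbers:

```java
/**
 * Back-of-the-envelope estimate: raw-data transfer time versus the quoted
 * single-machine encode time. All numbers are illustrative assumptions,
 * not measurements.
 */
public class EnvelopeEstimate {
    public static void main(String[] args) {
        double imageBytes = 1e12;          // ~1 TB of raw pixels (from the question)
        double linkBytesPerSec = 125e6;    // assumed 1 Gbit/s network link
        double singleMachineHours = 100;   // quoted single-machine encode time
        int workers = 20;                  // assumed number of map slots

        double transferHours = imageBytes / linkBytesPerSec / 3600.0;
        // Ideal speedup ignores skew, shuffle overhead and the serial reduce step.
        double idealParallelHours = singleMachineHours / workers + transferHours;

        System.out.printf("transfer: %.1f h, ideal parallel total: %.1f h, serial: %.0f h%n",
                transferHours, idealParallelHours, singleMachineHours);
    }
}
```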

