What Haskell representation is recommended for 2D, unboxed pixel arrays with millions of pixels?

Problem description

I want to tackle some image-processing problems in Haskell. I'm working with both bitonal (bitmap) and color images with millions of pixels. I have a number of questions:


  1. On what basis should I choose between Vector.Unboxed and UArray? They are both unboxed arrays, but the Vector abstraction seems heavily advertised, particularly around loop fusion. Is Vector always better? If not, when should I use which representation?

  2. For color images I will wish to store triples of 16-bit integers or triples of single-precision floating-point numbers. For this purpose, is either Vector or UArray easier to use? More performant?

  3. For bitonal images I will need to store only 1 bit per pixel. Is there a predefined datatype that can help me here by packing multiple pixels into a word, or am I on my own?

  4. Finally, my arrays are two-dimensional. I suppose I could deal with the extra indirection imposed by a representation as "array of arrays" (or vector of vectors), but I'd prefer an abstraction that has index-mapping support. Can anyone recommend anything from a standard library or from Hackage?

I am a functional programmer and have no need for mutation :-)

Recommended answer

For multi-dimensional arrays, the current best option in Haskell, in my view, is repa.

Repa provides high performance, regular, multi-dimensional, shape polymorphic parallel arrays. All numeric data is stored unboxed. Functions written with the Repa combinators are automatically parallel provided you supply +RTS -Nwhatever on the command line when running the program.

Recently, it has been used for some image processing problems:

  • Real time edge detection
  • Efficient Parallel Stencil Convolution in Haskell

I've started writing a tutorial on the use of repa, which is a good place to start if you already know Haskell arrays, or the vector library. The key stepping stone is the use of shape types instead of simple index types, to address multidimensional indices (and even stencils).

The repa-io package includes support for reading and writing .bmp image files, though support for more formats is needed.

Addressing your specific questions, here is a graphic, with discussion:

On what basis should I choose between Vector.Unboxed and UArray?

They have approximately the same underlying representation. The primary difference is the breadth of the API for working with vectors: they have almost all the operations you'd normally associate with lists (with a fusion-driven optimization framework), while UArray has almost no API.
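To make the API difference concrete, here is a minimal sketch of list-style processing over an unboxed vector, assuming the vector package is installed. The names `pixels`, `meanLevel`, and `countBright` and the sample values are illustrative, not from the question.

```haskell
import qualified Data.Vector.Unboxed as U
import Data.Word (Word16)

-- A hypothetical flat buffer of 16-bit grey levels (illustrative values).
pixels :: U.Vector Word16
pixels = U.fromList [0, 100, 2000, 65535, 300, 40000]

-- List-style combinators. With optimisation enabled, the fusion framework
-- can typically combine the map and the sum into a single pass.
meanLevel :: U.Vector Word16 -> Double
meanLevel v = U.sum (U.map fromIntegral v) / fromIntegral (U.length v)

-- Count the pixels at or above a threshold.
countBright :: Word16 -> U.Vector Word16 -> Int
countBright t = U.length . U.filter (>= t)

main :: IO ()
main = do
  print (meanLevel pixels)
  print (countBright 2000 pixels)
```

Writing the same functions against UArray would mean going through `elems` or explicit index loops, since the array package offers no comparable set of combinators.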

For color images I will wish to store triples of 16-bit integers or triples of single-precision floating-point numbers.

UArray has better support for multi-dimensional data, as it can use arbitrary data types for indexing. While this is possible in Vector (by writing an instance of UA for your element type; UA is the element class of the older uvector library, and its counterpart in the vector package is Unbox), it isn't the primary goal of Vector -- instead, this is where Repa steps in, making it very easy to use custom data types stored in an efficient manner, thanks to the shape indexing.

In Repa, your triple of shorts would have the type:

Array DIM3 Word16

That is, a 3D array of Word16s.
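As a rough sketch of how such an array might be built and indexed: the code below assumes Repa 3, where the array type carries an extra representation tag (`U` for unboxed), so the type reads `Array U DIM3 Word16` rather than the form shown above. The name `image`, the helper `channelAt`, and the rows x columns x channels layout are assumptions made for illustration.

```haskell
import Data.Array.Repa as R
import Data.Word (Word16)

-- Hypothetical 2x2 image with 3 channels per pixel, laid out as
-- rows x columns x channels (the last dimension varies fastest).
image :: Array U DIM3 Word16
image = R.fromListUnboxed (Z :. 2 :. 2 :. 3)
  [ 10, 20, 30,   40,  50,  60
  , 70, 80, 90,  100, 110, 120 ]

-- Look up one channel of one pixel through a shape index.
channelAt :: Int -> Int -> Int -> Word16
channelAt row col ch = image ! (Z :. row :. col :. ch)

main :: IO ()
main = do
  print (R.extent image)
  print (channelAt 1 0 2)
```

The shape value `Z :. row :. col :. ch` plays the role that a tuple index plays for UArray, and the same shape types are what Repa's combinators use to traverse the array.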

For bitonal images I will need to store only 1 bit per pixel.

UArrays pack Bools as bits. Vector uses the instance for Bool, which does not do bit packing, instead using a representation based on Word8. However, it is easy to write a bit-packing implementation for vectors -- here is one, from the (obsolete) uvector library. Under the hood, Repa uses Vectors, so I think it inherits that library's representation choices.

Is there a predefined datatype that can help me here by packing multiple pixels into a word?

You can use the existing instances for any of the libraries, for different word types, but you may need to write a few helpers using Data.Bits to roll and unroll packed data.
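The kind of helper meant here could look like the following sketch, which uses only Data.Bits and Data.Word from base. The names `packWord`, `packRow`, and `pixelAt` are hypothetical, and a plain list of words stands in for the unboxed array or vector of Word64 that a real implementation would use.

```haskell
import Data.Bits (setBit, testBit)
import Data.List (foldl')
import Data.Word (Word64)

-- Pack up to 64 pixels into one word; pixel i goes to bit i.
packWord :: [Bool] -> Word64
packWord bs = foldl' setBit 0 [i | (i, True) <- zip [0 ..] (take 64 bs)]

-- Split a row of pixels into 64-pixel chunks and pack each chunk.
packRow :: [Bool] -> [Word64]
packRow [] = []
packRow bs = packWord chunk : packRow rest
  where
    (chunk, rest) = splitAt 64 bs

-- Read pixel i back out of a packed row (i must lie within the row).
pixelAt :: [Word64] -> Int -> Bool
pixelAt ws i = testBit (ws !! (i `div` 64)) (i `mod` 64)

main :: IO ()
main = do
  let row = [even x | x <- [0 .. 99 :: Int]]
      packed = packRow row
  print (length packed)
  print (map (pixelAt packed) [0 .. 99] == row)
```

Storing the resulting words in any of the unboxed containers discussed above gives 64 pixels per element, at the cost of a divide and a modulo on each lookup.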

Finally, my arrays are two-dimensional.

UArray and Repa support efficient multi-dimensional arrays. Repa also has a rich interface for doing so. Vector on its own does not.
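For the UArray side, a minimal sketch of two-dimensional indexing with a tuple index, using only the array package that ships with GHC. The name `image`, the helper `rowOf`, and the (row, column) convention are assumptions for illustration.

```haskell
import Data.Array.Unboxed (UArray, bounds, listArray, (!))
import Data.Word (Word8)

-- Hypothetical 3-row, 4-column greyscale image indexed by (row, column).
-- listArray fills the array in row-major order for tuple indices.
image :: UArray (Int, Int) Word8
image = listArray ((0, 0), (2, 3)) [0 ..]

-- Extract one row as a list, using the array's own bounds.
rowOf :: UArray (Int, Int) Word8 -> Int -> [Word8]
rowOf arr r = [arr ! (r, c) | c <- [c0 .. c1]]
  where
    ((_, c0), (_, c1)) = bounds arr

main :: IO ()
main = do
  print (image ! (1, 2))
  print (rowOf image 2)
```

The storage is a single flat unboxed block, so the tuple index adds no extra indirection of the "array of arrays" kind.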

Notable mentions:


  • hmatrix, a custom array type with extensive bindings to linear algebra packages. Should be bound to use the vector or repa types.
  • ix-shapeable, getting more flexible indexing from regular arrays
  • chalkboard, Andy Gill's library for manipulating 2D images
  • codec-image-devil, read and write various image formats to UArray
