哈斯克尔:洗牌数据没有功能依赖 [英] Haskell: shuffling data without functional dependencies

查看:132
本文介绍了哈斯克尔:洗牌数据没有功能依赖的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图对一些数据实施Fisher-Yates混洗。这个算法对于一维数组很容易实现。然而,我需要能够在二维矩阵中混洗数据。

我认为一种可以很好地推广到更高维数组的方法是将我的任意维数矩阵转换为索引的一维数组,然后对这些数组进行混洗,然后通过将索引数组的每个索引处的元素与索引数组元素的索引处的元素进行交换来重新组织该矩阵。换句话说,要采取2×2的矩阵,如:

  1 2 
3 4

code>

我会将其转换为此数组:

  

然后,我会按照正常情况拼凑成 $ b $ (0,(1,0)),(1,(0,1)),(2,((1,1)),(3,(0) ,0))]

重组后,原始矩阵将变为:

  2 3 
4 1

我的基本方法是我想要一个类似下面的类:

  class Shufflable a其中
indices :: a - > Array Int b
reorganize :: a - > Array Int b - > a

然后我有一个函数来执行这个shuffle,看起来像这样:

 <$ c $ (RandomGen g)=> g  - > Array Int b  - >(Arr ay Int b,g)

思路是(减去RandomGen管道)我应该能够随机洗牌这样一个可洗牌的东西:

  shuffle ::(Shufflable a,RandomGen g)=> a  - > g  - > (a,g)
shuffle array = reorganize array(fisherYates(indices array))

这里是我到目前为止:

  { - #LANGUAGE MultiParamTypeClasses,FunctionalDependencies,FlexibleInstances# - } 

模块Shuffle其中

导入Data.Array隐藏(索引)
导入System.Random

fisherYates ::(RandomGen g)=> Array Int e - > g - > (Array Int e,g)
fisherYates arr gen = go max gen arr
where
(_,max)= bounds arr

go 0 g arr =( ($,$ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $ $'$
$ b class Shuffle ab | a - > b where
indices :: a - > Array Int b
reorganize :: a - > Array Int b - > a

shuffle ::(Shuffle a b,RandomGen g)=> a - > g - > (a,g)
shuffle a gen =(重组索引,gen')
其中
(indices,gen')= fisherYates(indices a)gen

实例(Ix ix)=> Shuffle(Array ix e)ix其中
重组a = undefined
indices a = array(0,maxIdx)(zip [0..maxIdx](范围界限))
其中
bound = bounds a
maxIdx = rangeSize bound - 1

swap :: Ix i =>阵列i e - >我 - >我 - >数组即
swap arr ij = arr // [(i,i'),(j,j')]
其中
i'= arr!j
j'= arr!i

我的问题:


  1. 我觉得这是解决一个简单问题的很多语言扩展。理解或编写另一种方式会更容易吗?

  2. 我觉得社区正朝着功能依赖关系的类型族发展。有没有办法使用它来解决这个问题?

  3. 我的一部分想知道是否可以将 fisherYates 函数移入不知何故, Shuffle 类型类。这是可能的和/或值得做的设置,以便您可以实现 shuffle 或实现索引 reorganize

谢谢!您可能想查看修复

>,它提供了 n 维数组,它们将形状<维度>(维度)编码为类型;你可以使用任何形状的数组来处理泛型操作。



我认为你完全可以通过使用 backpermute fromFunction 并转换索引(它比看起来更有效率,因为当你强制它变成一个unboxed数组时;事实上, backpermute 是根据 fromFunction 来实现的)。

repa自身使用了不少的语言扩展,但是为了兼顾性能(repa的数组是unboxed,并且提供的标准操作做了很好的事情,比如自动并行化),您可能会发现它更适合标准库的数组d便利性(IMO repa有比标准数组更好的API)。



这是一个很好的介绍修复

不可否认,这些都不会直接简化您的代码。但是如果repa的数组非常适合你,那么你最终的代码可能会避免你目前的解决方案的许多复杂性。






也就是说,将函数依赖关系转化为一个类型的家族非常简单; Shuffle 类变为

  class Shuffle a where 
type Elt a
indices :: a - > Array Int(Elt a)
reorganize :: a - > Array Int(Elt a) - > a

实例变为

  instance(Ix ix)=> Shuffle(Array ix e)其中
类型Elt(Array ix e)= ix
...

Shuffle ab 约束条件变为 Shuffle a


I'm trying to implement a Fisher-Yates shuffle of some data. This algorithm is easy to implement for one-dimensional arrays. However, I need to be able to shuffle data in a two-dimensional matrix.

An approach which I think could generalize nicely to higher dimensional arrays is to convert my arbitrarily dimensioned matrix to a single-dimensional array of indices, shuffle those, and then reorganize the matrix by swapping the element at each index of this index array with the element at the index of the element of the index array. In other words, to take a 2x2 matrix such as:

1  2
3  4

I would convert this into this "array":

[(0, (0,0)),  (1, (0,1)),  (2, ((1,0)),  (3, (1,1))]

This I would then scramble per normal into, say,

[(0, (1,0)),  (1, (0,1)),  (2, ((1,1)),  (3, (0,0))]

Once reorganized, the original matrix would become:

2  3
4  1

My basic approach here is that I want to have a type class that looks something like this:

class Shufflable a where
  indices    :: a -> Array Int b
  reorganize :: a -> Array Int b -> a

Then I'll have a function to perform the shuffle which looks like this:

fisherYates :: (RandomGen g) => g -> Array Int b -> (Array Int b, g)

The thinking is that (minus the RandomGen plumbing) I should be able to shuffle a shuffleable thing like so:

shuffle :: (Shufflable a, RandomGen g) => a -> g -> (a, g)
shuffle array = reorganize array (fisherYates (indices array))

Here's what I have so far:

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances  #-}

module Shuffle where

import Data.Array hiding (indices)
import System.Random         

fisherYates :: (RandomGen g) => Array Int e -> g -> (Array Int e, g)
fisherYates arr gen = go max gen arr
    where
      (_, max) = bounds arr

      go 0 g arr = (arr, g)
      go i g arr = go (i-1) g' (swap arr i j)
          where
            (j, g') = randomR (0, i) g

class Shuffle a b | a -> b where
  indices    :: a -> Array Int b
  reorganize :: a -> Array Int b -> a

shuffle :: (Shuffle a b, RandomGen g) => a -> g -> (a, g)
shuffle a gen = (reorganize a indexes, gen')
  where
    (indexes, gen') = fisherYates (indices a) gen

instance (Ix ix) => Shuffle (Array ix e) ix where
  reorganize a = undefined
  indices a    = array (0, maxIdx) (zip [0..maxIdx] (range bound))
      where
        bound = bounds a
        maxIdx = rangeSize bound - 1

swap :: Ix i => Array i e -> i -> i -> Array i e
swap arr i j = arr // [ (i, i'), (j, j') ]
    where
      i' = arr!j
      j' = arr!i

My problems:

  1. I feel like this is a lot of language extensions for solving a simple problem. Would it be easier to understand or write another way?
  2. I feel like the community is moving towards type families over functional dependencies. Is there a way to use that instead to solve this problem?
  3. A part of me wonders if the fisherYates function can be moved into the Shuffle typeclass somehow. Would it be possible and/or worth doing to set this up so that you either implement shuffle or implement both indices and reorganize?

Thanks!

解决方案

You might want to look into repa, which offers n-dimensional arrays which encode their shape (dimensions) into the type; you can code generic operations that work on arrays of any shape with it.

I think you could avoid a typeclass entirely by constructing the array with backpermute or fromFunction and translating the indices (it's more efficient than it looks, since it gets turned into an unboxed array when you force it; in fact, backpermute is implemented in terms of fromFunction under the hood).

repa itself uses quite a few language extensions, but you might find it preferable to the standard library's arrays for reasons of both performance (repa's arrays are unboxed, and the standard operations offered do nice things like automatic parallelisation) and convenience (IMO repa has a nicer API than the standard arrays).

Here's a good introduction to repa.

Admittedly, none of this directly simplifies your code. But if repa's arrays are a good fit for you, then the code you end up with will probably avoid many of the complexities of your current solution.


That said, turning your use of functional dependencies into a type family is really simple; the Shuffle class becomes

class Shuffle a where
  type Elt a
  indices    :: a -> Array Int (Elt a)
  reorganize :: a -> Array Int (Elt a) -> a

the instance becomes

instance (Ix ix) => Shuffle (Array ix e) where
  type Elt (Array ix e) = ix
  ...

and the Shuffle a b constraint becomes Shuffle a.

这篇关于哈斯克尔:洗牌数据没有功能依赖的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆