Haskell中的游程编码 [英] Run Length Encoding in Haskell

查看:55
本文介绍了Haskell中的游程编码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

import Data.List

data Encoding = Multiple Int Char | Single Char deriving (Eq,Show,Ord)

游程长度的编码

encode :: String -> [Encoding]
encode inputString =encoding (group inputString) []


encoding :: [String] -> [Encoding] -> [Encoding]
encoding groupString xs=
if (length groupString == 0)
    then xs
else
    case (head groupString) of
            ([c]) ->encoding (tail groupString)  (xs ++ [Single c])
            (x) -> encoding (tail groupString)  (xs ++ [Multiple (length x) (head x)])

游程长度解码

decode :: [Encoding] -> String
decode listString = decoding listString []              

decoding :: [Encoding] -> String -> String
decoding inputString xs=
if (length inputString == 0)
    then xs
else
    case (head inputString) of
        (Single x) ->decoding (tail inputString) (xs++ [x])
        (Multiple num x) ->decoding (tail inputString) (xs ++ (replicate num x) )

这是我的行程编码解决方案,谁能建议我一种更好的编写编码和解码功能的方法

This is my run length encoding solution, can anyone suggest me a better way to write encoding and decoding functions

推荐答案

您的许多代码专用于更新累加器.您在该累加器的尾部添加元素,这将对性能产生巨大影响.

A lot of your code is dedicated to updated an accumulator. You add elements to the tail of that accumulator, and this will have a drastic impact on performance.

之所以通常效率不高,是因为Haskell [a]中的列表至少在概念上被实现为链接列表.结果,将两个列表l1l2l1 ++ l2串联在一起,将花费 O(| l1 |)时间(即l1中的元素数).因此,这意味着如果列表中已经包含100个元素,则在末尾添加一个额外元素将需要大量工作.

The reason this is usually not very efficient is because a list in Haskell [a] is implemented - at least conceptually - as a linked list. As a result concatenating two lists l1 and l2 together with l1 ++ l2, will take O(|l1|) time (that is the number of elements in l1). So that means that if the list already contains 100 elements, adding one extra at the end will require a lot of work.

另一个反模式是使用headtail.是的,可以使用这些功能,但是不幸的是,由于您不使用模式匹配,因此可能会传递一个空列表,然后headtail会出错.

Another anti-pattern is the use of head and tail. Yes these functions can be used, but unfortunately since you do not use pattern matching, it could happen that an empty list is passed, and then head and tail will error.

这里的另一个问题是您在列表上使用length.由于有时Haskell中的列表可以具有无限长,因此length调用(如果需要)将永无止境.

Another problem here is that you use length on a list. Since sometimes lists in Haskell can have infinite length, the length call will - if we need it - never ends.

在必须在累加器末尾附加的情况下,通常我们可以将递归写在要构建的列表"cons"的末尾.因此,我们可以从以下位置重写程序:

In cases where you have to append at the end of the accumulator, usually we can write the recursion in the tail of a list "cons" we are building. So we can rewrite our program from:

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  = ...

现在的问题是我们如何计算这些值.我们可以使用 span :: (a -> Bool) -> [a] -> ([a],[a]) ,该函数将针对给定的谓词和列表构造一个2元组,其中第一项包含所有元素都满足谓词的列表的前缀,第二项包含其余元素列表中的,所以我们可以这样使用:

The question is now how we can count these values. We can use span :: (a -> Bool) -> [a] -> ([a],[a]), a function that will - for a given predicate and a list - construct a 2-tuple where the first item contains the prefix of the list where all elements satisfy the predicate, and the second item contains the rest of the list, so we can use this like:

encode :: String -> [Encoding]
encode [] = []
encode (h:t)  | nh > 1 = Multiple nh h : tl
              | otherwise = Single h : tl
    where (s1, s2) = span (h ==) t
          nh = 1 + length s1
          tl = encode s2

例如:

Prelude Data.List> encode "Foobaaar   qqquuux"
[Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']

对于解码,我们再次不需要累加器,可以使用

For the decoding, we again do not need an accumulator, and can make use of replicate :: Int -> a -> [a]:

decode :: [Encoding] -> String
decode [] = []
decode (Single h:t) = h : decode t
decode (Multiple nh h) = replicate nh h ++ decode t

或者我们可以改用列表单子:

Or we can use a list monad instead:

decode :: [Encoding] -> String
decode = (>>= f)
    where f (Single h) = [h]
          f (Multiple nh h) = replicate nh h

例如:

Prelude Data.List> decode [Single 'F',Multiple 2 'o',Single 'b',Multiple 3 'a',Single 'r',Multiple 3 ' ',Multiple 3 'q',Multiple 3 'u',Single 'x']
"Foobaaar   qqquuux"

这篇关于Haskell中的游程编码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆