Haskell - Having trouble understanding a small bit of code

Problem description

I am doing a school task where I am given a small bit of sample code which I can use later. I understand 90% of this code but there is one little line/function that I for the life of me can't figure out what it does (I am very new to Haskell btw).

Sample code:

data Profile = Profile {matrix::[[(Char,Int)]], moleType::SeqType, nrOfSeqs::Int, nm::String} deriving (Show)

nucleotides = "ACGT"
aminoacids = sort "ARNDCEQGHILKMFPSTWYVX"

makeProfileMatrix :: [MolSeq] -> [[(Char, Int)]]
makeProfileMatrix [] = error "Empty sequence list"
makeProfileMatrix sl = res
  where 
    t = seqType (head sl)
    defaults = 
      if (t == DNA) then
        zip nucleotides (replicate (length nucleotides) 0) -- Row 1
      else 
        zip aminoacids (replicate (length aminoacids) 0)   -- Row 2
    strs = map seqSequence sl                              -- Row 3
    tmp1 = map (map (\x -> ((head x), (length x))) . group . sort)
               (transpose strs)                            -- Row 4
    equalFst a b = (fst a) == (fst b)
    res = map sort (map (\l -> unionBy equalFst l defaults) tmp1)

{-Row 1: 'replicate' creates a list of zeros that is equal to the length of the 'nucleotides' string. 
This list is then 'zipped' (combines each element in each list into pairs/tuples) with the nucleotides-}

{-Row 2: 'replicate' creates a list of zeros that is equal to the length of the 'aminoacids' string.
This list is then 'zipped' (combines each element in each list into pairs/tuples) with the aminoacids-}

{-Row 3: The function 'seqSequence' is applied to each element in the 'sl' list and then returns a new altered list. 
In other words 'strs' becomes a list that contains all the sequences in 'sl' (sl contains MolSeq objects, not strings)-}

{-Row 4: (transpose strs) creates a list that has each 'column' of sequences as an element (the first element is made up of the first element of each sequence, etc.).
--}

I have written an explanation for each marked Row in the code (which I think so far is correct) but I get stuck when I try to figure out what Row 4 does. I understand the 'transpose' bit but I can't at all figure out what the inner map function does. As far as I know a 'map' function needs a list as a second parameter to function but the inner map function only has an anonymous function but no list to operate on. To be perfectly clear I don't understand what the entire inner line map (\x -> ((head x), (length x))) . group . sort does. Please help!

Bonus!:

Here is another piece of sample code that I can't figure out (never worked with classes in Haskell):

class Evol object where
 name :: object -> String
 distance :: object -> object -> Double
 distanceMatrix :: [object] -> [(String, String, Double)]
 addRow :: [object] -> Int -> [(String, String, Double)]
 distanceMatrix [] = []
 distanceMatrix object =
  addRow object 0 ++ distanceMatrix (tail object)
 addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num


 -- Determines the name and distance of an instance of "Evol" if the instance is a "MolSeq".
instance Evol MolSeq where
 name = seqName
 distance = seqDistance

 -- Determines the name and distance of an instance of "Evol" if the instance is a "Profile".
instance Evol Profile where
 name = profileName
 distance = profileDistance

Especially this part:

addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num

You don't have to explain this one if you don't want to I am just slightly confused as to what 'addRow' actually is trying to do (in detail).

Thanks!

Recommended answer

map (\x -> (head x, length x)) . group . sort is an idiomatic way of generating a histogram. When you see something like this that you don’t understand, try breaking it down into smaller pieces and testing them on sample inputs:

(\x -> (head x, length x)) "AAAA"
-- ('A', 4)

(group . sort) "CABABA"
-- ["AAA", "BB", "C"]

(map (\x -> (head x, length x)) . group . sort) "CABABA"
map (\x -> (head x, length x)) (group (sort "CABABA"))
-- [('A', 3), ('B', 2), ('C', 1)]

It’s written in point-free style as a composition of 3 functions, map (…), group, and sort, but could also be written as a lambda:

\row -> map (…) (group (sort row))
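As a minimal sketch of that equivalence, the two definitions below give the idiom a name in both styles. The names histogram and histogram' are illustrative only and do not appear in the sample code.

```haskell
import Data.List (group, sort)

-- Point-free form, as it appears in the sample code.
histogram :: Ord a => [a] -> [(a, Int)]
histogram = map (\x -> (head x, length x)) . group . sort

-- The same function with its argument written out explicitly.
histogram' :: Ord a => [a] -> [(a, Int)]
histogram' row = map (\x -> (head x, length x)) (group (sort row))
```

Both return [('A',3),('B',2),('C',1)] for "CABABA". The head call is safe here because group never produces an empty sublist.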

For each row in the transposed matrix, it produces a histogram of the data in that row. You could get a more visual representation of this by formatting it and printing it out:

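import Data.List (group, sort)

-- Render one group of equal values as "value:<tab>###".
showHistogramRow :: [Int] -> String
showHistogramRow row = concat
  [ show $ head row
  , ":\t"
  , replicate (length row) '#'
  ]

input :: [Int]
input = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

main :: IO ()
main = putStr
  $ unlines
  $ map showHistogramRow
  $ group
  $ sort input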

-- 1:   ##
-- 2:   #
-- 3:   ##
-- 4:   #
-- 5:   ###
-- 6:   #
-- 9:   #
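To tie this back to Row 4 of the question, here is a rough sketch of the same steps applied to the columns of a few DNA strings, including the unionBy step from the last line of makeProfileMatrix that fills in zero counts for letters that do not occur. The names histogram, profileColumns and example, and the input strings, are made up for illustration. Plain strings stand in for the MolSeq values, and defaults is hard-wired to the DNA case.

```haskell
import Data.List (group, sort, transpose, unionBy)

-- Count how often each element occurs (the histogram idiom).
histogram :: Ord a => [a] -> [(a, Int)]
histogram = map (\x -> (head x, length x)) . group . sort

-- Zero counts for every nucleotide, like the DNA branch of 'defaults'
-- (the sample code builds the zeros with replicate instead of repeat).
defaults :: [(Char, Int)]
defaults = zip "ACGT" (repeat 0)

-- One histogram per column, padded with the letters that did not occur.
profileColumns :: [String] -> [[(Char, Int)]]
profileColumns strs = map (sort . fill . histogram) (transpose strs)
  where
    -- Keep the real counts; add defaults only for letters not yet present.
    fill counts = unionBy (\a b -> fst a == fst b) counts defaults

-- Hypothetical input: three DNA sequences of length 3.
example :: [[(Char, Int)]]
example = profileColumns ["ACG", "ATG", "CCG"]
```

Here transpose turns the input into the columns "AAC", "CTC" and "GGG", so the first element of example is [('A',2),('C',1),('G',0),('T',0)].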

As for this:

addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num

addRow makes a list of the distances from the first element in object to each of the other elements. It uses indexing into the list in a sort of non-obvious way, when a simpler and more idiomatic map would suffice:

addRow object = map (\ b -> (name a, name b, distance a b)) object
  where a = head object

Ordinarily it’s good to avoid partial functions such as head because they can throw an exception on some inputs (e.g. head []). Here it’s fine, however, because if the input list is empty, then a will never be used, and so head will never be called.
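One way to convince yourself that the two versions agree, and that the empty list is safe, is to run both side by side. The sketch below passes name and distance in as ordinary arguments so that it does not depend on the Evol class. The helpers toyName and toyDistance are hypothetical stand-ins on plain Ints, not part of the assignment.

```haskell
-- Index-based version, following the structure of the original addRow.
addRowIndexed :: (a -> String) -> (a -> a -> Double) -> [a] -> Int
              -> [(String, String, Double)]
addRowIndexed name distance object num
  | num < length object =
      (name a, name b, distance a b) : addRowIndexed name distance object (num + 1)
  | otherwise = []
  where
    a = head object
    b = object !! num

-- Map-based version: pair the first element with every element of the list.
addRowMapped :: (a -> String) -> (a -> a -> Double) -> [a]
             -> [(String, String, Double)]
addRowMapped name distance object =
  map (\b -> (name a, name b, distance a b)) object
  where
    a = head object -- lazily bound; never evaluated when 'object' is empty

-- Hypothetical stand-ins for 'name' and 'distance'.
toyName :: Int -> String
toyName = show

toyDistance :: Int -> Int -> Double
toyDistance x y = fromIntegral (abs (x - y))
```

With these definitions, addRowIndexed toyName toyDistance [1, 4, 6] 0 and addRowMapped toyName toyDistance [1, 4, 6] both give [("1","1",0.0),("1","4",3.0),("1","6",5.0)], and addRowMapped toyName toyDistance [] is [] without ever calling head.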

distanceMatrix could be expressed with a map as well, because it’s just calling a function (addRow) on all the tails of the list and concatenating them together with ++:

distanceMatrix object = concatMap addRow (tails object)

This could be written in point-free style too. \x -> f (g x) can be written as just f . g; here, f is concatMap addRow and g is tails:

distanceMatrix = concatMap addRow . tails

Evol just describes the set of types for which you can generate a distanceMatrix, including MolSeq and Profile. Note that addRow and distanceMatrix don't need to be members of this class, because they're implemented entirely in terms of name and distance, so you could move them to the top level:

distanceMatrix :: (Evol object) => [object] -> [(String, String, Double)]
distanceMatrix = concatMap addRow . tails

addRow :: (Evol object) => [object] -> [(String, String, Double)]
addRow object = map (\ b -> (name a, name b, distance a b)) object
  where a = head object
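Put together, a self-contained version might look like the sketch below. The Point type, its instance and the points list are invented here purely to have something to run. In the assignment, MolSeq and Profile would take their place.

```haskell
import Data.List (tails)

-- The class keeps only the two operations each type must supply.
class Evol object where
  name :: object -> String
  distance :: object -> object -> Double

-- Hypothetical toy type standing in for MolSeq or Profile.
data Point = Point String Double

instance Evol Point where
  name (Point n _) = n
  distance (Point _ x) (Point _ y) = abs (x - y)

-- Distances from the first element to every element, itself included.
addRow :: Evol object => [object] -> [(String, String, Double)]
addRow object = map (\b -> (name a, name b, distance a b)) object
  where
    a = head object

-- One row per suffix of the list, concatenated.
distanceMatrix :: Evol object => [object] -> [(String, String, Double)]
distanceMatrix = concatMap addRow . tails

points :: [Point]
points = [Point "p" 0, Point "q" 2, Point "r" 5]
```

For this input, distanceMatrix points yields ("p","p",0.0), ("p","q",2.0), ("p","r",5.0), ("q","q",0.0), ("q","r",3.0), ("r","r",0.0). Each pair appears once, because every row starts at the current element rather than at the head of the full list.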
