Haskell - Having trouble understanding a small bit of code


Problem description


I am doing a school task where I am given a small bit of sample code which I can use later. I understand 90% of this code, but there is one little line/function that, for the life of me, I can't figure out (I am very new to Haskell btw).

Sample code:

data Profile = Profile {matrix::[[(Char,Int)]], moleType::SeqType, nrOfSeqs::Int, nm::String} deriving (Show)

nucleotides = "ACGT"
aminoacids = sort "ARNDCEQGHILKMFPSTWYVX"

makeProfileMatrix :: [MolSeq] -> [[(Char, Int)]]
makeProfileMatrix [] = error "Empty sequence list"
makeProfileMatrix sl = res
  where 
    t = seqType (head sl)
    defaults = 
      if (t == DNA) then
        zip nucleotides (replicate (length nucleotides) 0) -- Row 1
      else 
        zip aminoacids (replicate (length aminoacids) 0)   -- Row 2
    strs = map seqSequence sl                              -- Row 3
    tmp1 = map (map (\x -> ((head x), (length x))) . group . sort)
               (transpose strs)                            -- Row 4
    equalFst a b = (fst a) == (fst b)
    res = map sort (map (\l -> unionBy equalFst l defaults) tmp1)

{-Row 1: 'replicate' creates a list of zeros that is equal to the length of the 'nucleotides' string. 
This list is then 'zipped' (combines each element in each list into pairs/tuples) with the nucleotides-}

{-Row 2: 'replicate' creates a list of zeros that is equal to the length of the 'aminoacids' string.
This list is then 'zipped' (combines each element in each list into pairs/tuples) with the aminoacids-}

{-Row 3: The function 'seqSequence' is applied to each element in the 'sl' list and then returns a new altered list. 
In other words 'strs' becomes a list that contains all the sequences in 'sl' (sl contains MolSeq objects, not strings)-}

{-Row 4: (transpose strs) creates a list that has each 'column' of sequences as an element (the first element is made up of each first element in each sequence etc.).
--}

I have written an explanation for each marked Row in the code (which I think so far is correct) but I get stuck when I try to figure out what Row 4 does. I understand the 'transpose' bit but I can't at all figure out what the inner map function does. As far as I know a 'map' function needs a list as a second parameter to function but the inner map function only has an anonymous function but no list to operate on. To be perfectly clear I don't understand what the entire inner line map (\x -> ((head x), (length x))) . group . sort does. Please help!

Bonus!:

Here is another piece of sample code that I can't figure out (never worked with classes in Haskell):

class Evol object where
 name :: object -> String
 distance :: object -> object -> Double
 distanceMatrix :: [object] -> [(String, String, Double)]
 addRow :: [object] -> Int -> [(String, String, Double)]
 distanceMatrix [] = []
 distanceMatrix object =
  addRow object 0 ++ distanceMatrix (tail object)
 addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num


 -- Determines the name and distance of an instance of "Evol" if the instance is a "MolSeq".
instance Evol MolSeq where
 name = seqName
 distance = seqDistance

 -- Determines the name and distance of an instance of "Evol" if the instance is a "Profile".
instance Evol Profile where
 name = profileName
 distance = profileDistance

Especially this part:

addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num

You don't have to explain this one if you don't want to; I am just slightly confused as to what 'addRow' is actually trying to do (in detail).

Thanks!

Solution

map (\x -> (head x, length x)) . group . sort is an idiomatic way of generating a histogram. When you see something like this that you don’t understand, try breaking it down into smaller pieces and testing them on sample inputs:

(\x -> (head x, length x)) "AAAA"
-- ('A', 4)

(group . sort) "CABABA"
-- ["AAA", "BB", "C"]

(map (\x -> (head x, length x)) . group . sort) "CABABA"
map (\x -> (head x, length x)) (group (sort "CABABA"))
-- [('A', 3), ('B', 2), ('C', 1)]

It’s written in point-free style as a composition of 3 functions, map (…), group, and sort, but could also be written as a lambda:

\row -> map (…) (group (sort row))
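
This also answers the question of where the list comes from: map applied to only a function is itself a function that still expects a list, and the outer map feeds it one column at a time. A minimal sketch, where histogram, histogram' and sampleSeqs are names made up for illustration and the sample sequences stand in for strs:

```haskell
import Data.List (group, sort, transpose)

-- Hypothetical name for the composed function from Row 4.
-- 'map f' on its own is a function that still expects a list;
-- composing it with 'group . sort' gives a function from a column
-- to its list of (character, count) pairs.
histogram :: String -> [(Char, Int)]
histogram = map (\x -> (head x, length x)) . group . sort

-- The same function written with an explicit argument.
histogram' :: String -> [(Char, Int)]
histogram' column = map (\x -> (head x, length x)) (group (sort column))

-- Made-up sample sequences standing in for 'strs'.
sampleSeqs :: [String]
sampleSeqs = ["ACGT", "ACGA", "TCGA"]

main :: IO ()
main = do
  print (transpose sampleSeqs)
  -- ["AAT","CCC","GGG","TAA"]
  print (map histogram (transpose sampleSeqs))
  -- [[('A',2),('T',1)],[('C',3)],[('G',3)],[('A',2),('T',1)]]
  print (map histogram sampleSeqs == map histogram' sampleSeqs)
  -- True
```

head is safe inside the lambda here because group never produces an empty sublist.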

For each row in the transposed matrix, it produces a histogram of the data in that row. You could get a more visual representation of this by formatting it and printing it out:

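import Data.List (group, sort)

-- One line per distinct value: the value, a tab, then one '#' per occurrence.
showHistogramRow :: [Int] -> String
showHistogramRow row = concat
  [ show $ head row
  , ":\t"
  , replicate (length row) '#'
  ]

input :: [Int]
input = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

main :: IO ()
main = putStr
  $ unlines
  $ map showHistogramRow
  $ group
  $ sort input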

-- 1:   ##
-- 2:   #
-- 3:   ##
-- 4:   #
-- 5:   ###
-- 6:   #
-- 9:   #
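
In makeProfileMatrix the histogram of each column is then merged with defaults, so that letters which never occur in a column still show up with a count of 0. A rough sketch of that last step, assuming DNA input; fillColumn is a hypothetical helper name and the sample column is made up:

```haskell
import Data.List (group, sort, unionBy)

nucleotides :: String
nucleotides = "ACGT"

-- Zero counts for every nucleotide, as in Row 1 of the question.
defaults :: [(Char, Int)]
defaults = zip nucleotides (replicate (length nucleotides) 0)

-- Two pairs are "equal" when they describe the same letter.
equalFst :: (Char, Int) -> (Char, Int) -> Bool
equalFst a b = fst a == fst b

-- Hypothetical helper: histogram of one column, padded with the
-- defaults for letters that do not occur, then sorted by letter.
fillColumn :: String -> [(Char, Int)]
fillColumn column = sort (unionBy equalFst counts defaults)
  where
    counts = map (\x -> (head x, length x)) (group (sort column))

main :: IO ()
main = do
  print defaults
  -- [('A',0),('C',0),('G',0),('T',0)]
  print (fillColumn "AAT")
  -- [('A',2),('C',0),('G',0),('T',1)]
```

unionBy keeps everything from its first list and appends only those elements of the second list that have no match in the first, which is why the real counts win over the zero defaults.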

As for this:

addRow object num  -- Adds row to distance matrix
  | num < length object = (name a, name b, distance a b) : addRow object (num + 1)
  | otherwise = [] 
  where  
        a = head object
        b = object !! num

addRow makes a list of the distances from the first element in object to every element of the list, itself included (num starts at 0). It uses indexing into the list in a sort of non-obvious way, when a simpler and more idiomatic map would suffice:

addRow object = map (\ b -> (name a, name b, distance a b)) object
  where a = head object

Ordinarily it’s good to avoid partial functions such as head because they can throw an exception on some inputs (e.g. head []). Here it’s fine, however, because if the input list is empty, then a will never be used, and so head will never be called.

distanceMatrix could be expressed with a map as well, because it’s just calling a function (addRow) on all the tails of the list (tails is exported by Data.List) and concatenating the results together with ++:

distanceMatrix object = concatMap addRow (tails object)

This could be written in point-free style too. \x -> f (g x) can be written as just f . g; here, f is concatMap addRow and g is tails:

distanceMatrix = concatMap addRow . tails
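
To see the shape this produces without needing the Evol class, here is a small sketch on plain lists. pairWithHead and allPairs are hypothetical stand-ins for addRow and distanceMatrix; they pair elements instead of computing distances:

```haskell
import Data.List (tails)

-- Hypothetical stand-in for addRow: pair the head of the list with
-- every element of the list (including the head itself).
pairWithHead :: [a] -> [(a, a)]
pairWithHead xs = map (\b -> (a, b)) xs
  where a = head xs  -- never evaluated when xs is empty

-- Same shape as distanceMatrix = concatMap addRow . tails
allPairs :: [a] -> [(a, a)]
allPairs = concatMap pairWithHead . tails

main :: IO ()
main = do
  print (tails "abc")
  -- ["abc","bc","c",""]
  print (allPairs "abc")
  -- [('a','a'),('a','b'),('a','c'),('b','b'),('b','c'),('c','c')]
```

Note that tails ends with the empty list; that last call contributes nothing, which plays the role of the distanceMatrix [] = [] base case in the original.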

Evol just describes the set of types for which you can generate a distanceMatrix, including MolSeq and Profile. Note that addRow and distanceMatrix don’t need to be members of this class, because they’re implemented entirely in terms of name and distance, so you could move them to the top level:

distanceMatrix :: (Evol object) => [object] -> [(String, String, Double)]
distanceMatrix = concatMap addRow . tails

-- The map-based addRow no longer takes an index, so the Int argument is gone.
addRow :: (Evol object) => [object] -> [(String, String, Double)]
addRow object = map (\b -> (name a, name b, distance a b)) object
  where a = head object
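
Put together, the top-level version could look roughly like the following sketch, with addRow taking only the list. Point is a hypothetical toy type standing in for MolSeq or Profile, and its distance is simply the absolute difference of two numbers; neither is part of the original assignment:

```haskell
import Data.List (tails)

class Evol object where
  name     :: object -> String
  distance :: object -> object -> Double

-- Hypothetical toy type, standing in for MolSeq or Profile.
data Point = Point String Double

instance Evol Point where
  name (Point n _) = n
  distance (Point _ x) (Point _ y) = abs (x - y)

addRow :: Evol object => [object] -> [(String, String, Double)]
addRow objects = map (\b -> (name a, name b, distance a b)) objects
  where a = head objects  -- only forced when objects is non-empty

distanceMatrix :: Evol object => [object] -> [(String, String, Double)]
distanceMatrix = concatMap addRow . tails

main :: IO ()
main = do
  mapM_ print (distanceMatrix [Point "p" 0, Point "q" 3, Point "r" 5])
  -- ("p","p",0.0)
  -- ("p","q",3.0)
  -- ("p","r",5.0)
  -- ("q","q",0.0)
  -- ("q","r",2.0)
  -- ("r","r",0.0)
  print (distanceMatrix ([] :: [Point]))
  -- []
```

The empty-list call shows the laziness point in practice: head is never evaluated, so no exception is thrown.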
