统计列表中每个元素的频率 [英] Count frequency of each element in a list

查看:105
本文介绍了统计列表中每个元素的频率的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

  In:aabbcabb

我试着编写一个程序来计算列表中每个元素的频率。
Out:[(a,3),(b,4),(c,1)]

您可以在以下链接查看我的代码: http://codepad.org/nyIECIT2
在这段代码中,唯一函数的输出就像这样:

  In:aabbcabb
输出:abc

使用独特的输出我们将计算目标的频率名单。
您也可以在这里看到代码:

  frequencyOfElt xs = ans 
其中ans = countElt xs)xs
unique [] = []
unique xs =(head xs):unique(filter((/ =)(head xs))xs))
countElt ref target = ('''),(==',b'),(==' ,c')](filter。(==))ref)(repeat target)

错误:输入语法错误(意外符号unique)

但是在ghci 6.13中也显示其他类型的错误



几个问我什么是使用[(=='a'),(==',b'),(==',c')]的目的。
我所期望的:如果ref =abc和target =aabbaacc
,那么

  zipWith ($)(map filter ref)(repeat target)

会显示[aaaa,bb ,cc]然后我可以使用地图长度超过这个来获得频率
在这里根据ref使用[(=='a'),(==',b'))来过滤列表, (==',c')]



我假设一些逻辑错误[[=='a'),(==',b'), =',c')] here。

解决方案

你没有说你是否想自己写,或者可以从一些标准函数中编写它。




$ b

  import Data.List 

gs = map(\ x - >([head x],length x))。组。分类$ s

- g =地图(head&&& length)。组。排序 - 没有

是标准的quick-n-dirty代码方式它是。






好的,您最初的想法是无码点风格。某些曲调在我脑海中播放......):
$ b

  frequencyOfElt ::(Eq a)= > [a]  - > [(a,Int)] 
frequencyOfElt xs = countElt(唯一xs)xs - 更改结果类型
其中
unique [] = []
unique(x:xs )= x:unique(filter(/ = x)xs)

countElt ref target = - 编码无点式(您的原始想法)
zip
ref $ - 您的原始类型需要(map(:[])ref)
map length $
zipWith($) - ((filter。(==))c)===(filter (== c))
(zipWith($)(repeat(filter。(==)))ref)
(repeat target)

我将这里的类型更改为更合理的 [a] - > [(a,Int)] 顺便说一句。请注意,那
$ b

zipWith($)fs(repeat z)=== map($ z )fs
zipWith($)(repeat f)zs === map(f $)zs === map f zs

因此代码简化为
$ b

  countElt ref target = 
zip
ref $
地图长度$
地图($ target)
(zipWith($)(repeat(filter。(==)))ref)

然后

 zip 
ref $
地图长度$
地图($ target)$
地图(过滤器。 (==))ref

map f $ map g xs == = map(fg)xs ,所以
$ b

  countElt ref target = 
zip
ref $
地图(长度。 ($目标)。过滤器。 (==))ref - (1)

这有点更清楚了(对我来说)用一个列表理解书写,

  countElt ref target = 
[(c,( ($ target)。filter。(==))c)| (< -ref]
== [(c,length(($ target)(filter(== c))))| c <-ref]
== [(c,length $ filter(== c)target)| (< -ref)

这给了我们一个想法, / p>

  countElt ref target = 
zip *< map(length。(`filter` target)。(==))$ ref

在这里,无点代码变得毫无意义。






因此,回到可读的列表解析,使用一个标准的 nub 函数,它相当于你的独特的,你的想法变成了

import Data.List

frequencyOfElt xs = [(c,length $ filter(== c) xs)| c < - nub xs]

这个算法实际上是二次的(〜n ^ 2 ),所以它比第一个版本更糟糕,因为它是以> sort 为主的,即linearithmic(〜n log(n))。




这个代码虽然可以进一步操作,等价转换:
$ b

= [(c,length。filter(== c)$ sort xs )| c <-nub xs]

...因为在列表中搜索与搜索相同一个列表,排序。在这里做更多的工作 - 它会得到回报吗?..


$ b $

= [(c,length 。filter(== c)$ sort xs)| (c:_)< - group $ sort xs]

...对吗?但现在, group 已经将它们分组为(==),所以不需要 filter 调用重复 group 完成的工作:

  = [(c,length。get c。group $ sort xs)| (c:_)<  -  group $ sort xs] 
其中get c gs = fromJust。 find((== c).head)$ gs

= [(c,length g)| g @(c:_)< - group $ sort xs]

= [(head g,length g)| g< - group(sort xs)]

=(map(head&& length).group。sort)xs
pre>

不是吗?在这里,从本文开始的同一个线性算法算法,通过分解其隐藏的常见计算,实际上从您的代码中派生,使其可用于重用和代码简化。

I try to write a program which will count the frequency of each element in a list.

    In: "aabbcabb"
    Out: [("a",3),("b",4),("c",1)]

You can view my code in the following link: http://codepad.org/nyIECIT2 In this code the output of unique function would be like this

     In: "aabbcabb"
     Out: "abc"

Using the output of unique we wil count the frequency of the target list. You can see the code here also:

    frequencyOfElt xs=ans
       where ans=countElt(unique xs) xs
          unique []=[]
      unique xs=(head xs):(unique (filter((/=)(head xs))xs))
      countElt ref target=ans'
             where ans'=zip ref lengths
            lengths=map length $ zipWith($)(map[(=='a'),(==',b'),(==',c')](filter.(==))ref)(repeat target)

    Error:Syntax error in input (unexpected symbol "unique") 

But in ghci 6.13 other type of error are showing also

Few asked me what is the purpose of using [(=='a'),(==',b'),(==',c')]. What I expect: If ref="abc" and target="aabbaacc" then

    zipWith($) (map filter ref)(repeat target)

will show ["aaaa","bb","cc"] then I can use map length over this to get the frequency Here for filtering list according with the ref i use [(=='a'),(==',b'),(==',c')]

I assume some logical error lies [(=='a'),(==',b'),(==',c')] here..

解决方案

You didn't say whether you want to write it whole on your own, or whether it's OK to compose it from some standard functions.

import Data.List

g s = map (\x -> ([head x], length x)) . group . sort $ s

-- g = map (head &&& length) . group . sort     -- without the [...]

is the standard quick-n-dirty way to code it.


OK, so your original idea was to Code it Point-Free Style (certain tune playing in my head...):

frequencyOfElt :: (Eq a) => [a] -> [(a,Int)]
frequencyOfElt xs = countElt (unique xs) xs     -- change the result type
  where 
    unique [] = []
    unique (x:xs) = x : unique (filter (/= x) xs)  

    countElt ref target =   -- Code it Point-Free Style  (your original idea)
      zip 
        ref $               -- your original type would need (map (:[]) ref) here
        map length $
          zipWith ($)       -- ((filter . (==)) c) === (filter (== c))
            (zipWith ($) (repeat (filter . (==))) ref)  
            (repeat target)

I've changed the type here to the more reasonable [a] -> [(a,Int)] btw. Note, that

zipWith ($) fs (repeat z) === map ($ z) fs
zipWith ($) (repeat f) zs === map (f $) zs === map f zs

hence the code simplifies to

    countElt ref target =  
      zip 
        ref $              
        map length $
          map ($ target)      
            (zipWith ($) (repeat (filter . (==))) ref)  

and then

    countElt ref target =  
      zip 
        ref $              
        map length $
          map ($ target) $
            map (filter . (==)) ref

but map f $ map g xs === map (f.g) xs, so

    countElt ref target =  
      zip 
        ref $              
        map (length . ($ target) . filter . (==)) ref      -- (1)

which is a bit clearer (for my taste) written with a list comprehension,

    countElt ref target =  
        [ (c, (length . ($ target) . filter . (==)) c) | c <- ref] 
     == [ (c,  length ( ($ target) ( filter (== c))))  | c <- ref]     
     == [ (c,  length $ filter (== c) target)          | c <- ref]     

Which gives us an idea to re-write (1) further as

    countElt ref target =  
      zip <*> map (length . (`filter` target) . (==)) $ ref

but this obsession with point-free code becomes pointless here.


So going back to the readable list comprehensions, using a standard nub function which is equivalent to your unique, your idea becomes

import Data.List

frequencyOfElt xs = [ (c, length $ filter (== c) xs) | c <- nub xs]

This algorithm is actually quadratic (~ n^2), so it is worse than the first version above which is dominated by sort i.e. is linearithmic (~ n log(n)).


This code though can be manipulated further by a principle of equivalent transformations:

  = [ (c, length . filter (== c) $ sort xs) | c <- nub xs]

... because searching in a list is the same as searching in a list, sorted. Doing more work here -- will it pay off?..

  = [ (c, length . filter (== c) $ sort xs) | (c:_) <- group $ sort xs]

... right? But now, group had already grouped them by (==), so there's no need for the filter call to repeat the work already done by group:

  = [ (c, length . get c . group $ sort xs) | (c:_) <- group $ sort xs]
            where get c gs = fromJust . find ((== c).head) $ gs

  = [ (c, length g) | g@(c:_) <- group $ sort xs]

  = [ (head g, length g) | g <- group (sort xs)]

  = (map (head &&& length) . group . sort) xs

isn't it? And here it is, the same linearithmic algorithm from the beginning of this post, actually derived from your code by factoring out its hidden common computations, making them available for reuse and code simplification.

这篇关于统计列表中每个元素的频率的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆