Idiomatic Mode function in Clojure


Question

I'm learning Clojure and would like some advice on idiomatic usage. As part of a small statistics package, I have a function to calculate the mode of a set of data. (Background: The mode is the most common value in a set of data. There are almost a dozen published algorithms to calculate it. The one used here is from "Fundamentals of Biostatistics" 6th Ed by Bernard Rosner.)

(defn tally-map
 " Create a map where the keys are all of the unique elements in the input
   sequence and the values represent the number of times those elements
   occur. Note that the keys may not be formatted as conventional Clojure
   keys, i.e. a colon preceding a symbol."
  [aseq]
  (apply merge-with + (map (fn [x] {x 1}) aseq)))

(defn mode
 " Calculate the mode. Rosner p. 13. The mode is problematic in that it may
   not be unique and may not exist at all for a particular group of data.
   If there is a single unique mode, it is returned. If there are multiple
   modes, they are returned as a list. If there is no mode, that is all
   elements are present in equal frequency, nil is returned."
  [aseq]
  (let [amap (tally-map aseq)
        mx (apply max (vals amap))
        k (keys amap)
        f (fn [x] (not (nil? x)))
        modes (filter f (map #(if (= mx (get amap %)) %) k))]
    (cond (= 1 (count modes)) (first modes)
          (every? #(= mx %) (vals amap)) nil
          :else modes)))

There are a couple of things I have questions about:

  1. The argument. The function accepts a single sequence. Is it more idiomatic to accept a variable number of arguments like the addition function?
  2. Code smell. It seems like the "let" is a bit more complicated than it should be -- so many variable assignments. Have I missed any obvious (or not so obvious) uses of the language or library that would make this method more concise?

Thanks in advance for the help.

Answer

Here's my take:

  1. Many core Clojure functions take a sequence as their argument, while others take a variable number of arguments, so in my opinion there is no single idiomatic choice. If you already have your data in a sequence, I would use a seq as the argument, since it saves you a call to apply.

  2. I wouldn't write a function that returns a value in some cases and a list of values in others, because the calling code will always have to check the return value before using it. Instead I would return a single mode as a seq with just one item in it. But you may have your reasons, depending on the code that calls this function.
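
To make the first point concrete, here is a minimal sketch of the two calling styles. The names `count-distinct` and `count-distinct*` are hypothetical helpers invented for this illustration; they are not part of the question or of clojure.core.

```clojure
;; Hypothetical helpers, for illustration only.
(defn count-distinct
  "Takes a single sequence argument."
  [aseq]
  (count (set aseq)))

(defn count-distinct*
  "Variadic version: gathers its arguments into a seq and delegates."
  [& xs]
  (count-distinct xs))

(def data [1 2 2 3])

(count-distinct data)         ; data is already a seq, so call directly
(apply count-distinct* data)  ; the variadic version needs apply to spread the seq
(count-distinct* 1 2 2 3)     ; handy mainly when the values are written out literally
```

All three calls compute the same result; the difference is only in how the caller has to hand over data it already holds in a collection.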

Apart from that I would rewrite the mode function like this:

(defn mode [aseq]
  (let [amap (tally-map aseq)
        mx (apply max (vals amap))
        modes (map key (filter #(= mx (val %)) amap))
        c (count modes)]
    (cond
      (= c 1) (first modes)
      (= c (count amap)) nil
      :else modes)))

Instead of defining a function f you could use the identity function (unless your data contains values that are logically false). But you don't even need that. I find the modes in a different way, which is more readable to me: The map amap acts as a sequence of map entries (key-value pairs). First I filter only those entries that have the value mx. Then I map the key function on these, giving me a sequence of keys.
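
The filter-then-map step can be tried on its own at the REPL. This is a small sketch on a hand-written tally map; the literal map and the name `top-entries` are made up for the example.

```clojure
;; A hand-written tally map standing in for (tally-map aseq).
(def amap {:a 3, :b 1, :c 3})

;; Highest frequency present in the map.
(def mx (apply max (vals amap)))

;; Treating the map as a seq of entries, keep those whose value equals mx.
(def top-entries (filter #(= mx (val %)) amap))

;; Map `key` over the surviving entries to get just the keys.
(def modes (map key top-entries))
;; modes now holds the keys :a and :c
```

Because `filter` receives whole map entries, `key` and `val` do the work that `get` and the helper function f did in the original version.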

To check whether there are any modes I don't loop over the map again. Instead I just compare the number of modes to the number of map entries. If they are equal, all elements have the same frequency!
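
As a rough check of that reasoning, the sketch below compares the count-based test with the `every?` test from the question. The function names are hypothetical and exist only for this comparison.

```clojure
;; Hypothetical names, for illustration only.
(defn all-equal-by-count?
  "True when the number of entries with the maximum frequency equals the
   total number of entries."
  [amap]
  (let [mx (apply max (vals amap))
        n-modes (count (filter #(= mx (val %)) amap))]
    (= n-modes (count amap))))

(defn all-equal-by-every?
  "The check used in the question: every frequency equals the maximum."
  [amap]
  (let [mx (apply max (vals amap))]
    (every? #(= mx %) (vals amap))))

(all-equal-by-count? {:a 2, :b 2})   ; all frequencies equal
(all-equal-by-count? {:a 2, :b 1})   ; :a is the only mode
```

Both functions agree on these inputs. Note that a map with a single entry also passes the check, which is why the order of the cond clauses in mode matters: the `(= c 1)` clause is tested first.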

Here's the function that always returns a seq:

(defn modes [aseq]
  (let [amap (tally-map aseq)
        mx (apply max (vals amap))
        modes (map key (filter #(= mx (val %)) amap))]
    (when (< (count modes) (count amap)) modes)))
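
On the question of library functions that make this more concise: clojure.core provides `frequencies`, which builds the same kind of element-to-count map as tally-map. The following is only a sketch of how the seq-returning version might look with it. The name `modes-via-frequencies` and the guard for empty input are my own additions, not part of the code above.

```clojure
;; Sketch only: `modes-via-frequencies` is a hypothetical name.
(defn modes-via-frequencies
  [aseq]
  (let [amap (frequencies aseq)]        ; same shape as (tally-map aseq)
    (when (seq amap)                    ; added guard: empty input yields nil
      (let [mx (apply max (vals amap))
            modes (map key (filter #(= mx (val %)) amap))]
        (when (< (count modes) (count amap))
          modes)))))

(modes-via-frequencies [1 2 2 3])     ; one mode, still returned as a seq
(modes-via-frequencies [1 1 2 2 3])   ; two modes
(modes-via-frequencies [1 2 3])       ; every value equally frequent: nil
```

Callers can then treat the result uniformly, for example with `(seq result)` or `when-let`, without checking whether they received a single value or a list.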
