函数式编程:列表是否只包含唯一项目? [英] Functional Programming: Does a list only contain unique items?

查看:110
本文介绍了函数式编程:列表是否只包含唯一项目?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个未排序的列表,并且想知道它中的所有项目是否都是唯一的。

我的天真方法是

  val l = List(1,2,3,4,3)
def isUniqueList(l:List [Int])=(new HashSet()++ l).size == l.size



基本上,我正在检查包含列表中所有元素的Set是否具有相同的大小(因为项目在原始中出现两次列表只会出现一次),但我不确定这是否是这个问题的理想解决方案。



编辑:
我测试了三种最流行的解决方案, l == l.distinct l.size == l.distinct.size 和Alexey的基于HashSet的解决方案。
每个函数运行1000次,包含10个项目的唯一列表,10000个项目的唯一列表,以及第三个季度中出现的一个项目的相同列表被复制到列表的中间。在每次运行之前,每个函数被调用1000次以预热JIT,整个基准运行5次,然后用System.currentTimeMillis执行时间。
该机器为C2D P8400(2.26 GHz),内存为3GB,Java版本为OpenJDK 64位服务器虚拟机(1.6.0.20)。 java的参数是-Xmx1536M -Xms512M

结果:

 l.size == l .distinct.size(3,571,2,6492)
l == l.distinct(3,5601,26054)
Alexey的HashSet(2,1590,3,781)
$ b $ p对大对象的结果(字符串从1KB到5KB):

 l.size == l.distinct.size MutableList(4,5566,7,6506)
l == l.distinct MutableList(4,5926,3,6075)
Alexey的HashSet MutableList(2,2341,3,784)

使用HashSets的解决方案绝对是最快的,因为他已经指出使用.size并没有太大区别。

解决方案

这是我能想到的最快速的纯功能解决方案: code> def isUniqueList(l:List [T])= isUniqueList1(l,new HashSet [T])

@tailrec
def isUniqueList1(l:List [T], s:Set [T])= l match {
case Nil => true
case(h :: t)=> if(s(h))false else isUniqueList1(t,s + h)
}

这应该会更快,但使用可变数据结构(基于Vasil Remeniuk给出的 distinct 实现):

<$对于(x < - 这个),pre $ def isUniqueList(l:List [T]):Boolean = {
val seen = mutable.HashSet [A] ){
if(seen(x)){
return false
}
else {
seen + = x
}
}






这里是最简单的(相当于你的):

  def isUniqueList(l:List [T])= l.toSet.size == l.size 


I'm having an unsorted list and want to know, whether all items in it are unique.
My naive approach would be

val l = List(1,2,3,4,3)
def isUniqueList(l: List[Int]) = (new HashSet()++l).size == l.size

Basically, I'm checking whether a Set containing all elements of the list has the same size (since an item appearing twice in the original list will only appear once in the set), but I'm not sure whether this is the ideal solution for this problem.

Edit: I benchmarked the 3 most popular solutions, l==l.distinct, l.size==l.distinct.size and Alexey's HashSet-based solution. Each function was run 1000 times with a unique list of 10 items, a unique list of 10000 items and the same lists with one item appearing in the third quarter copied to the middle of the list. Before each run, each function got called 1000 times to warm up the JIT, the whole benchmark was run 5 times before the times were taken with System.currentTimeMillis. The machine was a C2D P8400 (2.26 GHz) with 3GB RAM, the java version was the OpenJDK 64bit server VM (1.6.0.20). The java args were -Xmx1536M -Xms512M

The results:

l.size==l.distinct.size (3, 5471, 2, 6492)
l==l.distinct           (3, 5601, 2, 6054)
Alexey's HashSet        (2, 1590, 3, 781)

The results with larger objects (Strings from 1KB to 5KB):

l.size==l.distinct.size MutableList(4, 5566, 7, 6506)
l==l.distinct           MutableList(4, 5926, 3, 6075)
Alexey's HashSet        MutableList(2, 2341, 3, 784)

The solution using HashSets is definitely the fastest, and as he already pointed out using .size doesn't make a major difference.

解决方案

Here is the fastest purely functional solution I can think of:

def isUniqueList(l: List[T]) = isUniqueList1(l, new HashSet[T])

@tailrec
def isUniqueList1(l: List[T], s: Set[T]) = l match {
  case Nil => true
  case (h :: t) => if (s(h)) false else isUniqueList1(t, s + h)
}

This should be faster, but uses mutable data structure (based on the distinct implementation given by Vasil Remeniuk):

def isUniqueList(l: List[T]): Boolean = {
  val seen = mutable.HashSet[A]()
  for (x <- this) {
    if (seen(x)) {
      return false
    }
    else {
      seen += x
    }
  }
  true
}

And here is the simplest (equivalent to yours):

def isUniqueList(l: List[T]) = l.toSet.size == l.size

这篇关于函数式编程:列表是否只包含唯一项目?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆