What's the difference between `ImmutableSortedSet` and F# `Set`?


Question

What is the difference between ImmutableSortedSet and the native F# Set? Their performance characteristics seem similar. I also read somewhere that SortedSet is implemented as a red-black tree, so I guess ImmutableSortedSet is too.

What is the internal implementation of the F# Map? Is it a red-black tree as claimed here, or an AVL tree as found out here?

In addition, why don't the MSDN documents state clearly what the actual data structure behind each library collection is? I know these are implementation details and are subject to change. My point is that if they don't want to bind a library data type to a particular well-known data structure, shouldn't they at least offer a summary of the complexity of every method?

Recommended answer

I am wondering what's the difference between ImmutableSortedSet and the native FSharp Set?

They are generally very similar. The main difference is that the F# Set supports fast set-theoretic operations (union, intersection and difference).
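To make the API difference concrete, here is a small sketch of the three set-theoretic operations on both types. The example values are arbitrary and chosen only for illustration. The F# functions take two concrete sets, whereas the ImmutableSortedSet methods accept any IEnumerable<int>.

```fsharp
open System.Collections.Immutable

// F# Set: the set-theoretic functions take two concrete sets.
let a = set [1; 2; 3; 4]
let b = set [3; 4; 5; 6]

let unionAB = Set.union a b        // elements in either set
let interAB = Set.intersect a b    // elements in both sets
let diffAB  = Set.difference a b   // elements of a that are not in b

// BCL ImmutableSortedSet: the corresponding methods accept any IEnumerable<int>.
let x = ImmutableSortedSet.CreateRange<int>([1; 2; 3; 4])
let y = ImmutableSortedSet.CreateRange<int>([3; 4; 5; 6])

let unionXY = x.Union y
let interXY = x.Intersect y
let diffXY  = x.Except y

printfn "F# Set: %A %A %A" unionAB interAB diffAB
printfn "BCL:    %A %A %A" (List.ofSeq unionXY) (List.ofSeq interXY) (List.ofSeq diffXY)
```

Both variants produce the same elements; what differs is how much the implementation can assume about its argument.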

Here is a simple F# program that measures the performance of some common operations:

open System.Collections.Immutable

// Runs both benchmarks repeatedly until interrupted (e.g. with Ctrl+C).
while true do
  do
    let timer = System.Diagnostics.Stopwatch.StartNew()
    let cmp = LanguagePrimitives.FastGenericComparer<int>
    let mutable s1 = ImmutableSortedSet.Create<int>(cmp)
    let mutable s2 = ImmutableSortedSet.Create<int>(cmp)
    for i in 1..1000000 do
      s1 <- s1.Add i
    for i in 1000000..2000000 do
      s2 <- s2.Add i
    printfn "BCL ImmutableSortedSet: add in %fs" timer.Elapsed.TotalSeconds
    timer.Restart()
    for _ in 1..10 do
      for i in 1..1000000 do
        ignore(s1.Contains i)
    printfn "BCL ImmutableSortedSet: contains in %fs" timer.Elapsed.TotalSeconds
    timer.Restart()
    let s = s1.Union s2
    printfn "BCL ImmutableSortedSet: union in %fs" timer.Elapsed.TotalSeconds

  do
    let timer = System.Diagnostics.Stopwatch.StartNew()
    let mutable s1 = Set.empty
    let mutable s2 = Set.empty
    for i in 1..1000000 do
      s1 <- s1.Add i
    for i in 1000000..2000000 do
      s2 <- s2.Add i
    printfn "F# Set: add in %fs" timer.Elapsed.TotalSeconds
    timer.Restart()
    for _ in 1..10 do
      for i in 1..1000000 do
        ignore(s1.Contains i)
    printfn "F# Set: contains in %fs" timer.Elapsed.TotalSeconds
    timer.Restart()
    let s = Set.union s1 s2
    printfn "F# Set: union in %fs" timer.Elapsed.TotalSeconds

On my machine, I get:

         BCL ImmutableSortedSet  F# Set
add                2.6s          3.0s
contains           2.1s          1.9s
union              1.1s          0.00004s

So the F# Set is slightly slower to construct and slightly faster to search, but orders of magnitude faster for the set-theoretic union operation.

What is the internal implementation of fsharp map? Is it a Red Black Tree as claimed here or an AVL tree as found out here?

As both of your links state, F# uses AVL trees.

This is actually relevant to the performance figures above. Each branch of an AVL tree stores the height of its subtree, so subtrees can be rebalanced without examining them in full. In contrast, a red-black tree stores only a single bit of data in each branch, so rebalancing subtrees requires the entire tree to be traversed, which is asymptotically slower. In layman's terms, the union of two same-sized non-overlapping sets entails little more than creating a new branch containing the two existing trees. Note that Union in the BCL API cannot even express this: it accepts an abstract IEnumerable rather than a concrete set.
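The "one new branch" idea can be illustrated with a minimal height-annotated tree. This is only a sketch and not the FSharp.Core source: the Tree type and the helpers height, mk, ofSorted, joinBalanced and toList are hypothetical names introduced here. It assumes that every element of the left tree is smaller than the pivot and every element of the right tree is larger.

```fsharp
// Hypothetical, simplified AVL-style tree: each node caches its height.
type Tree =
  | Empty
  | Node of left: Tree * value: int * right: Tree * height: int

let height t =
  match t with
  | Empty -> 0
  | Node(_, _, _, h) -> h

// Node constructor: computes the cached height in O(1) from the children.
let mk l v r = Node(l, v, r, 1 + max (height l) (height r))

// Builds a balanced tree from the sorted slice xs.[lo..hi].
let rec ofSorted (xs: int[]) lo hi =
  if lo > hi then Empty
  else
    let mid = lo + (hi - lo) / 2
    mk (ofSorted xs lo (mid - 1)) xs.[mid] (ofSorted xs (mid + 1) hi)

// Joins two trees around a pivot. Because the heights are cached, the
// balance check is O(1) and the result is a single new node that shares
// both existing trees unchanged.
let joinBalanced l pivot r =
  if abs (height l - height r) > 1 then
    invalidArg "l" "heights differ by more than one; rotations would be needed"
  mk l pivot r

let rec toList t =
  match t with
  | Empty -> []
  | Node(l, v, r, _) -> toList l @ (v :: toList r)

let left = ofSorted [| 1 .. 7 |] 0 6
let right = ofSorted [| 9 .. 15 |] 0 6
let joined = joinBalanced left 8 right

// True when the joined tree reuses the original subtrees rather than copying them.
let sharesSubtrees =
  match joined with
  | Node(l, _, r, _) -> obj.ReferenceEquals(l, left) && obj.ReferenceEquals(r, right)
  | Empty -> false

printfn "%A (height %d, shares subtrees: %b)" (toList joined) (height joined) sharesSubtrees
```

A real implementation also has to handle overlapping ranges and trees of very different heights, which this sketch deliberately rejects.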

In addition, why don't the MSDN documents state clearly what the actual data structure is for the library collection? I know these are implementation details and are subject to change. My point is that if they don't want to bind the library data type to a certain type of well-known data structure, shouldn't they at least offer a summary of the complexity of all the methods?

I agree that complexities in the docs would be good.
