What's the difference between `ImmutableSortedSet` and the F# `Set`?
Question
I am wondering what the difference is between ImmutableSortedSet and the native F# Set. It seems that the performance signatures of both are similar. I also saw somewhere that SortedSet is implemented as a red-black tree, so I guess ImmutableSortedSet is too.
What is the internal implementation of the F# Map? Is it a red-black tree as claimed here, or an AVL tree as found out here?
In addition, why doesn't the MSDN documentation state clearly what the actual data structure behind each library collection is? I know these are implementation details and are subject to change. My point is that if they don't want to bind a library data type to a particular well-known data structure, they should at least offer a summary of the performance characteristics of all the methods in terms of complexity.
Recommended answer
I am wondering what's the difference between ImmutableSortedSet and the native FSharp Set?
They are generally very similar. The main difference is that the F# Set supports fast set-theoretic operations (union, intersection and difference).
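For readers unfamiliar with these operations, here is a minimal sketch of how they look in the F# core library. The set names and values are illustrative assumptions and are not part of the benchmark in this answer; `Set.union`, `Set.intersect` and `Set.difference` are the standard module functions.

```fsharp
// Two small illustrative sets; the values are arbitrary.
let small = Set.ofList [1; 2; 3; 4]
let evens = Set.ofList [2; 4; 6; 8]

// Elements found in either set: 1, 2, 3, 4, 6, 8
let unionSet = Set.union small evens

// Elements found in both sets: 2, 4
let commonSet = Set.intersect small evens

// Elements of `small` that are not in `evens`: 1, 3
let onlySmall = Set.difference small evens

printfn "union        = %A" (Set.toList unionSet)
printfn "intersection = %A" (Set.toList commonSet)
printfn "difference   = %A" (Set.toList onlySmall)
```

The same union and difference can also be written with the `+` and `-` operators, as in `small + evens` and `small - evens`.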
Here is a simple F# program that measures the performance of some common operations:
open System.Collections.Immutable

// Runs the benchmark repeatedly; stop it manually (e.g. Ctrl+C).
while true do
    // BCL ImmutableSortedSet
    do
        let timer = System.Diagnostics.Stopwatch.StartNew()
        let cmp = LanguagePrimitives.FastGenericComparer<int>
        let mutable s1 = ImmutableSortedSet.Create<int>(cmp)
        let mutable s2 = ImmutableSortedSet.Create<int>(cmp)
        // Build two sets of about a million elements each.
        for i in 1..1000000 do
            s1 <- s1.Add i
        for i in 1000000..2000000 do
            s2 <- s2.Add i
        printfn "BCL ImmutableSortedSet: add in %fs" timer.Elapsed.TotalSeconds
        timer.Restart()
        // Ten million successful lookups.
        for _ in 1..10 do
            for i in 1..1000000 do
                ignore(s1.Contains i)
        printfn "BCL ImmutableSortedSet: contains in %fs" timer.Elapsed.TotalSeconds
        timer.Restart()
        let s = s1.Union s2
        printfn "BCL ImmutableSortedSet: union in %fs" timer.Elapsed.TotalSeconds
    // F# Set
    do
        let timer = System.Diagnostics.Stopwatch.StartNew()
        let mutable s1 = Set.empty
        let mutable s2 = Set.empty
        for i in 1..1000000 do
            s1 <- s1.Add i
        for i in 1000000..2000000 do
            s2 <- s2.Add i
        printfn "F# Set: add in %fs" timer.Elapsed.TotalSeconds
        timer.Restart()
        for _ in 1..10 do
            for i in 1..1000000 do
                ignore(s1.Contains i)
        printfn "F# Set: contains in %fs" timer.Elapsed.TotalSeconds
        timer.Restart()
        let s = Set.union s1 s2
        printfn "F# Set: union in %fs" timer.Elapsed.TotalSeconds
On my machine, I get:
            BCL ImmutableSortedSet    F# Set
add         2.6s                      3.0s
contains    2.1s                      1.9s
union       1.1s                      0.00004s
So the F# Set is slightly slower to construct and slightly faster to search, but orders of magnitude faster for the set-theoretic union operation.
What is the internal implementation of the F# Map? Is it a red-black tree as claimed here or an AVL tree as found out here?
As both of your links state, F# uses AVL trees.
This is actually relevant to the performance figures above. AVL trees store the maximum height of a subtree in each branch and therefore allow subtrees to be rebalanced without examining the entire subtree. In contrast, red-black trees store a single bit of data in each branch, so rebalancing subtrees requires the entire trees to be traversed, which is asymptotically slower. In layman's terms, the union of two same-sized non-overlapping sets entails little more than creating a new branch containing the two existing trees. Note that Union in the BCL API cannot even express this: it takes an abstract IEnumerable rather than a concrete set.
In addition, why doesn't the MSDN documentation state clearly what the actual data structure behind each library collection is? I know these are implementation details and are subject to change. My point is that if they don't want to bind a library data type to a particular well-known data structure, they should at least offer a summary of the performance characteristics of all the methods in terms of complexity.
I agree that complexities in the docs would be good.