Why is the Inverse Ackermann function used to describe complexity of Kruskal's algorithm?


Question

In a class for analysis of algorithms, we are presented with this pseudocode for Kruskal's algorithm:
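The pseudocode image itself did not survive in this page. The standard CLRS presentation of MST-KRUSKAL, which the step numbers quoted in this question appear to follow, reads roughly as below (a reconstruction, not the lecturer's exact slide); line numbers are kept because the question refers to them:

```
MST-KRUSKAL(G, w)
1  A = ∅
2  for each vertex v ∈ G.V
3      MAKE-SET(v)
4  sort the edges of G.E into nondecreasing order by weight w
5  for each edge (u, v) ∈ G.E, taken in nondecreasing order by weight
6      if FIND-SET(u) ≠ FIND-SET(v)
7          A = A ∪ {(u, v)}
8          UNION(u, v)
9  return A
```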

He then states the following, for disjoint-set forests:


A sequence of m MAKE-SET, UNION, and FIND-SET operations, n of which are MAKE-SET operations, can be performed on a disjoint-set forest with union by rank and path compression in worst-case time O(m α(n)).

Used to compute the complexity of Step 2, and Steps 5-8


For connected G: |E| ≥ |V| − 1; m = O(V + E), n = O(V);

So Steps 2, 5-8: O((V + E) α(V)) = O(E α(V))

α(V) = O(lg V) = O(lg E); so we obtain O(E lg E) ----- // how is α(V) equal here?

Kruskal: Steps 3, 5-8 and Step 4: O(E lg E)

Observe: |E| < |V|² → lg E = O(lg V)

So, Kruskal complexity: O(E lg V)

I have attempted to understand the logic behind this "alpha(n)"/"α(n)" function, and from what I've read it seems that, simplistically, the Ackermann function is one that grows incredibly fast (far faster than exponentially), and the inverse is one that grows incredibly slowly.

If my interpretation is correct, what does "α(n)" represent? Does it mean that MAKE-SET operations are at most O(lg n)? How/why is using inverse-Ackermann necessary? I was under the impression this operation is performed V times (once for each vertex). Following this, α(V) is also simplified to O(lg V) = O(lg E); does this mean that, at a maximum, α(V) may be represented by O(lg V)?

Also, why is the |E| < |V|² → lg E = O(lg V) statement made? How is it known that |E| < |V|²?

I think my question really boils down to: why does a "forest" representation of disjoint sets seem more efficient than one implemented with linked lists, when my lecturer states they are both O(E log V)? Is there a point, then, to the increased difficulty of implementing disjoint sets with forests?

Answer

α(V) = O(lg V) is a common abuse of notation; really we have α(V) ∈ O(lg V) (inverse-Ackermann of V is a member of the set of functions O(lg V)). They're not equal; they're not even the same type of thing: one is a function, the other is a set of functions.
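To get a feel for how slowly α(n) grows, here is a minimal sketch of the CLRS-style inverse Ackermann, using as given the first few values of the fast-growing function A_k(1) from that definition: A_0(1) = 2, A_1(1) = 3, A_2(1) = 7, A_3(1) = 2047, and A_4(1) ≥ 16^512 (vastly more than the number of atoms in the observable universe):

```python
# Thresholds A_k(1) for k = 0..3, taken from the CLRS definition of A_k(j).
THRESHOLDS = [2, 3, 7, 2047]

def alpha(n):
    """Inverse Ackermann in the CLRS sense: least k with A_k(1) >= n (up to k = 4)."""
    for k, a_k_1 in enumerate(THRESHOLDS):
        if a_k_1 >= n:
            return k
    # A_4(1) >= 16**512 covers every n that could fit in a physical computer,
    # so alpha(n) <= 4 for any input you will ever see.
    assert n <= 16**512, "n beyond tabulated thresholds"
    return 4
```

So for any realistic graph, α(V) is at most 4; replacing it by O(lg V) throws that precision away, which is exactly the point being made above.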


how is it known that |E| < |V|²?

How many edges does a complete undirected graph have? You can't have more than that. You could in a multigraph, but that's not what the algorithm operates on, and extending it to multigraphs is pointless anyway - just throw out all but the best edge between each pair of nodes.


why is it that a "forest" representation of disjoint sets seems to be more efficient than those implemented with linked lists when my lecturer states they are both O(E log V)?

This is a weird thing to ask, for several reasons. First, you're effectively measuring the efficiency of disjoint sets through Kruskal's algorithm, not on its own. The "they" in your question refers to two implementations of Kruskal's algorithm. Secondly, as you surely realized, the derivation of the upper bound used α(V) ∈ O(lg V), so it deliberately ignores a significant difference. That makes sense, because the time complexity is asymptotically dominated by the sorting step, but just because a difference is invisible in a big O doesn't mean it isn't there.

Therefore is there a point in the increased difficulty of implementing disjoint sets with forests?

There is no increased difficulty, really. It's a very simple data structure that you can write in 5 minutes: just two arrays and some simple code. Linked lists may actually be harder, especially if you have to do manual memory management. Note that outside the context of Kruskal's algorithm, the difference is huge in terms of both asymptotic time and actual time.

But even in the context of Kruskal's algorithm, improving the second stage obviously makes the total time better, even if it doesn't show in the worst-case asymptotic time. FWIW, you can improve the first stage too: you can use a heap (or one of its fancier drop-in replacements) and merely heapify the edges in linear time. The second stage of the algorithm then extracts them one by one, but, crucially, you typically don't have to extract every edge - you can keep track of how many disjoint sets are left and stop when that drops to 1, potentially leaving many (even most) edges unused. In the worst case that doesn't help, but in real life it does. And in special cases you can sort the edges faster than O(E lg E), when one of the fast sorts (counting sort, bucket sort, etc.) applies.

