B-tree比AVL还是RedBlack-Tree? [英] B-tree faster than AVL or RedBlack-Tree?

查看:191
本文介绍了B-tree比AVL还是RedBlack-Tree?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我知道性能永远不是黑白的,一般情况下,一个实现在X和更慢的情况下更快,如果Y等等,但一般来说,B-tree比AVL还是RedBlack-Trees更快?实现AVL树(甚至RedBlack-tree?)的复杂程度要复杂得多,但是它们是否更快?(复杂度是否有所回报)?


$ b $编辑:我还要补充说,如果它们更快,那么等效的AVL / RedBlack树(在节点/内容方面) - 为什么是更快?

解决方案

Sean的帖子(当前接受的)包含几个不正确的声明。对不起,肖恩,我不是粗鲁的;我希望我可以说服你,我的说法是事实上的。


他们的用例完全不同,所以不可能进行比较。


它们都用于维护一组具有快速查找,插入和删除的完全有序的项目。它们具有相同的界面和相同的意图。


RB树通常是用于提供快速访问的内存中结构(理想情况下为O(logN ))到数据。 [...]


总是 O(log n)


B树通常是基于磁盘的结构,因此本身比内存中的数据慢。


废话。当您将搜索树存储在磁盘上时,通常使用B树。这是真的当您将数据存储在磁盘上时,访问速度比内存中的数据慢。但是,存储在磁盘上的红黑树也比存储在内存中的红黑树慢。



你正在比较苹果和橘子在这里真正有趣的是比较内存中的B树和内存中的红黑树。



[除此之外:B树,而不是红黑树在理论上在I / O模型中是有效的。我已经通过实验测试(并验证)I / O模型进行排序;我希望它也适用于B树。]


B树很少是二叉树,孩子数量节点可以具有大量的数据。


要清楚,B树节点的大小范围是树的参数(在C ++中,您可能需要使用整数值作为模板参数)。


B树结构的管理可以是数据变化非常复杂。


我记得他们比红黑树更容易理解(并实现) / p>


B树尝试最小化磁盘访问次数,以便数据检索是合理确定的。


< blockquote>

这是真的。


看到像4 B-在非常数据库中查找一些数据所需的树访问。


Go t数据?


在大多数情况下,我会说内存中的RB树更快。


获取数据?


因为查找是二进制的,很容易找到东西。每个节点B树可以有多个子节点,所以在每个节点上,您必须扫描节点来寻找适当的子节点。这是一个O(N)操作。


每个节点的大小是一个固定参数,因此即使进行线性扫描这是O(1)。如果我们大于每个节点的大小,请注意,您通常会将数组排序为O(log n)。


在一个RB树上,它将是O(logN),因为你做了一个比较,然后分支。


你是比较苹果和橘子。 O(log n)是因为树的高度至多为O(log n),就像B树一样。



另外,除非你会用红黑色的树木玩耍分配技巧,似乎有理由推测B树具有更好的缓存行为(它访问一个数组,而不是遍布整个地方的指针,并且具有较少的分配开销,增加了内存的局部性) ),可能会帮助它在速度赛中。



我可以指出实验证据表明,B-tree(具体参数为32和64)具有很强的竞争力红黑色的树木为小尺寸,并优于它的手,甚至中等大的n值。请参阅 http://idlebox.net /2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html



B树更快。为什么?我推测这是由于内存的位置,更好的缓存行为和更少的指针追逐(如果不一样的东西,在一定程度上重叠)。


I know that performance never is black and white, often one implementation is faster in case X and slower in case Y, etc. but in general - are B-trees faster then AVL or RedBlack-Trees? They are considerably more complex to implement then AVL trees (and maybe even RedBlack-trees?), but are they faster (does their complexity pay off) ?

Edit: I should also like to add that if they are faster then the equivalent AVL/RedBlack tree (in terms of nodes/content) - why are they faster?

解决方案

Sean's post (the currently accepted one) contains several incorrect claims. Sorry Sean, I don't mean to be rude; I hope I can convince you that my statement is based in fact.

They're totally different in their use cases, so it's not possible to make a comparison.

They're both used for maintaining a set of totally ordered items with fast lookup, insertion and deletion. They have the same interface and the same intention.

RB trees are typically in-memory structures used to provide fast access (ideally O(logN)) to data. [...]

always O(log n)

B-trees are typically disk-based structures, and so are inherently slower than in-memory data.

Nonsense. When you store search trees on disk, you typically use B-trees. That much is true. When you store data on disk, it's slower to access than data in memory. But a red-black tree stored on disk is also slower than a red-black tree stored in memory.

You're comparing apples and oranges here. What is really interesting is a comparison of in-memory B-trees and in-memory red-black trees.

[As an aside: B-trees, as opposed to red-black trees, are theoretically efficient in the I/O-model. I have experimentally tested (and validated) the I/O-model for sorting; I'd expect it to work for B-trees as well.]

B-trees are rarely binary trees, the number of children a node can have is typically a large number.

To be clear, the size range of B-tree nodes is a parameter of the tree (in C++, you may want to use an integer value as a template parameter).

The management of the B-tree structure can be quite complicated when the data changes.

I remember them to be much simpler to understand (and implement) than red-black trees.

B-tree try to minimize the number of disk accesses so that data retrieval is reasonably deterministic.

That much is true.

It's not uncommon to see something like 4 B-tree access necessary to lookup a bit of data in a very database.

Got data?

In most cases I'd say that in-memory RB trees are faster.

Got data?

Because the lookup is binary it's very easy to find something. B-tree can have multiple children per node, so on each node you have to scan the node to look for the appropriate child. This is an O(N) operation.

The size of each node is a fixed parameter, so even if you do a linear scan, it's O(1). If we big-oh over the size of each node, note that you typically keep the array sorted so it's O(log n).

On a RB-tree it'd be O(logN) since you're doing one comparison and then branching.

You're comparing apples and oranges. The O(log n) is because the height of the tree is at most O(log n), just as it is for a B-tree.

Also, unless you play nasty allocation tricks with the red-black trees, it seems reasonable to conjecture that B-trees have better caching behavior (it accesses an array, not pointers strewn about all over the place, and has less allocation overhead increasing memory locality even more), which might help it in the speed race.

I can point to experimental evidence that B-trees (with size parameters 32 and 64, specifically) are very competitive with red-black trees for small sizes, and outperforms it hands down for even moderately large values of n. See http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html

B-trees are faster. Why? I conjecture that it's due to memory locality, better caching behavior and less pointer chasing (which are, if not the same things, overlapping to some degree).

这篇关于B-tree比AVL还是RedBlack-Tree?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆