缓存感知树实现 [英] Cache-aware tree impementation

查看:175
本文介绍了缓存感知树实现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一棵树,其中每个节点可能有0到 N 个孩子。

I have a tree where every node may have 0 to N children.

用例是以下查询:给定指向两个节点的指针:这些节点在树的同一分支中?

Use-case is the following query: Given pointers to two nodes: Are these nodes within the same branch of the tree?

示例

q(2,7) => true
q(5,4) => false



按书(慢)



直接实现是在每个节点上存储指向父级的指针和指向子级列表的指针。但这会导致性能下降,因为树会分散在内存中,因此无法感知缓存。

By the book (slow)

The straight forward implementation would be to store a pointer to the parent and a pointer to a list of children at each node. But this would lead to bad performance because the tree would be fragmented in memory and therefor not cache-aware.

以紧凑的形式表示树的好方法是什么?整个树大约有100,000个节点。因此,应该有可能找到一种使其完全适合CPU缓存的方法。

What would be a good way to represent the tree in compact form? The whole tree has about 100,000 nodes. So it should be possible to find a way to make it fit completely in the CPU-cache.

例如,二叉树通常是隐式表示为数组,因此非常适合完全存储在CPU缓存中(如果很小的话)足够了。)

Binary trees for example are often represented implicitly as an array and are therefor perfect to be completely stored in the CPU-cache (if small enough).

推荐答案

您可以在分配信息的位置预分配连续内存块对于所有节点。

You can pre-allocate a contiguous block of memory where you concatenate the information for all nodes.

此后,每个节点将只需要一种方法来检索其信息的开头以及该信息的长度。

Afterwards, each node would only need a way to retrieve the beginning of its information, and the length of that information.

在这种情况下,每个节点的信息都可以由父代表示,然后是子代列表(假设没有父代, ie

In this case, the information for each node could be represented by the parent, followed by the list of children (let's assume that we use -1 when there is no parent, i.e. for the root).

例如,对于问题中发布的树,节点1的信息为: -1 23 4 ,节点2的信息为: 1 5 ,依此类推。

For example, for the tree posted in the question, the information for node 1 would be: -1 2 3 4, the information for node 2 is: 1 5, and so on.

通过将这些数组连接起来可以获得连续的数组,结果如下:

The contiguous array would be obtained by concatenating these arrays, resulting in something like:

-1 2 3 4 1 5 1 9 10 1 11 12 13 14 2 3 5 5 5 3 3 4 4 4 15 4

每个节点将使用一些元数据来检索其关联信息。如前所述,此元数据需要包含 startIndex length 例如对于节点3,我们将具有 startIndex = 6 length = 3 ,其中允许检索 1 9 10 子数组,指示其父节点为节点1,其子节点为节点9和10。

Each node would use some metadata to allow retrieving its associated information. As mentioned, this metadata would need to consist of a startIndex and length. E.g. for node 3, we would have startIndex = 6, length = 3, which allows to retrieve the 1 9 10 subarray, indicating that the parent is node 1, and its children are nodes 9 and 10.

此外,元数据信息也可以从一开始就存储在连续的存储块中。元数据每个节点的长度都是固定的(两个值),因此我们可以很容易地获得给定节点索引的特定节点的元数据位置。

In addition, the metadata information can also be stored in the contiguous memory block, at the beginning. The metadata has fixed length for each node (two values), thus we can easily obtain the position of the metadata for a certain node, given its index.

,有关该图的所有信息将存储在一个连续的,易于缓存的内存块中。

In this way, all the information about the graph will be stored in a contiguous, cache-friendly, memory block.

这篇关于缓存感知树实现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆