Is a hash index never a clustering index?

Question

From Database System Concepts

We use the term hash index to denote hash file structures as well as secondary hash indices. Strictly speaking, hash indices are only secondary index structures. A hash index is never needed as a clustering index structure, since, if a file itself is organized by hashing, there is no need for a separate hash index structure on it. However, since hash file organization provides the same direct access to records that indexing provides, we pretend that a file organized by hashing also has a clustering hash index on it.

Is "secondary index" the same concept as "nonclustering index" (which is what I understood from the book)?

Is a hash index never a clustering index or not?

Could you rephrase or explain why the reason "A hash index is never needed as a clustering index structure" is "if a file itself is organized by hashing, there is no need for a separate hash index structure on it"? What about "if a file itself is not organized by hashing"?

Thanks.

Recommended answer

The text tries to explain something but unfortunately creates more confusion than it resolves.

At the logical level, database tables (correct term: "relations") are made up of rows (correct term: "tuples") which represent facts about the real world the db is meant to represent/reflect. Don't ever call those rows/tuples "records", because "records" is a concept pertaining to the physical level, which is distinct from the logical one.

Typically, but this is not a universal law cast in stone, you will find that the physical organization consists of a "main" datastore which has a record for each tuple, where that record contains each and every attribute (column) value of the tuple (row). (That's unless there are LOBs in play or so.) Those records must be given a physical location in the store that holds them, and this is usually done using a B-tree on the primary key values. This facilitates:

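  • Retrieving only [the tuple/row with] a specific primary key value from the relation/table.
  • Traversing the [tuples of the] relation in order of primary key value.
  • Retrieving only [the tuples/rows within] a specific range of primary key values from the relation/table.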

This B-tree on the primary key values is typically called the "clustering" index.
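To make the three access patterns above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: a sorted list searched with `bisect` stands in for the B-tree, and the `ClusteredStore` class and its method names are hypothetical, not taken from the textbook or from any DBMS.

```python
import bisect


class ClusteredStore:
    """Toy "main" datastore: full records kept in primary-key order.

    Assumption: a sorted list plays the role of the B-tree leaf level.
    """

    def __init__(self):
        self._keys = []     # primary keys, always kept sorted
        self._records = []  # full records, parallel to _keys

    def insert(self, pk, record):
        i = bisect.bisect_left(self._keys, pk)
        if i < len(self._keys) and self._keys[i] == pk:
            raise KeyError(f"duplicate primary key: {pk!r}")
        self._keys.insert(i, pk)
        self._records.insert(i, record)

    def get(self, pk):
        """Point lookup by primary key value; None if absent."""
        i = bisect.bisect_left(self._keys, pk)
        if i < len(self._keys) and self._keys[i] == pk:
            return self._records[i]
        return None

    def scan(self):
        """All (pk, record) pairs in primary-key order."""
        return list(zip(self._keys, self._records))

    def range(self, lo, hi):
        """All (pk, record) pairs with lo <= pk <= hi."""
        i = bisect.bisect_left(self._keys, lo)
        j = bisect.bisect_right(self._keys, hi)
        return list(zip(self._keys[i:j], self._records[i:j]))


store = ClusteredStore()
for pk, name in [(30, "Carol"), (10, "Alice"), (20, "Bob"), (40, "Dave")]:
    store.insert(pk, {"name": name})
```

The point is that one ordered structure serves all three needs: `get` for a single key, `scan` for in-order traversal, and `range` for a key interval.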

Often, there is also a frequent need for retrieving only [tuples/rows with] specific values of attributes that are not the primary key. If that needs to be done as efficiently/fast as it can be for values of the primary key, we use similar indexes that are then sometimes called "secondary". Those indexes typically do not contain all the attribute/column values of the tuple/row indexed, but only the attribute values to be indexed plus a mention of the primary key value (so we can find the rest of the attributes in the "main" datastore).

Those "secondary" indexes will mostly also be B-tree indexes, which permit in-order traversal of the attributes being indexed, but they can also be hash indexes, which only permit looking up tuples/rows using equality comparisons with a given key value. ("Key" here means the index key, which has nothing to do with the keys on the relation/table, though obviously for most keys on the table/relation there will also be a dedicated index whose index key has the same attributes as the table key it supports.)
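The difference between the two kinds of secondary index can be sketched as follows. This is a toy model under stated assumptions: a `dict` stands in for a hash index, a sorted list of `(value, primary key)` pairs stands in for a B-tree, and the table contents and function names are made up for illustration. Both indexes store only the indexed attribute plus the primary key, and the rest of the record is fetched from the main store.

```python
import bisect
from collections import defaultdict

# "Main" datastore keyed by primary key (a plain dict here, for brevity).
main = {
    1: {"name": "Alice", "city": "Oslo"},
    2: {"name": "Bob", "city": "Rome"},
    3: {"name": "Carol", "city": "Oslo"},
    4: {"name": "Dave", "city": "Paris"},
}

# Hash-style secondary index: indexed value -> list of primary keys.
hash_index = defaultdict(list)
for pk, rec in main.items():
    hash_index[rec["city"]].append(pk)


def find_by_city(city):
    """Equality lookup: the only access a hash index supports."""
    return [main[pk] for pk in hash_index.get(city, [])]


# Ordered secondary index: sorted (indexed value, primary key) pairs.
ordered_index = sorted((rec["city"], pk) for pk, rec in main.items())


def find_city_range(lo, hi):
    """Records with lo <= city <= hi, returned in city order."""
    i = bisect.bisect_left(ordered_index, (lo,))
    j = bisect.bisect_right(ordered_index, (hi, float("inf")))
    return [main[pk] for _, pk in ordered_index[i:j]]
```

`find_city_range` has no efficient counterpart on `hash_index`: answering it there would mean visiting every entry, which is the "no in-order traversal" limitation described above.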

Finally, there is no theoretical reason why a "primary" (/"clustered") index could not be a hash index (the text kinda suggests the opposite but that is plain wrong). But given the poor level of the explanation in your textbook, it is probably not expected of you to be taught that.
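For what it's worth, here is roughly what a primary organization by hashing could look like, and why the quoted passage says no separate hash index is needed in that case. This is only a sketch under my own assumptions: the `HashFile` class, the fixed bucket count, and the use of Python's built-in `hash` are illustrative choices, not how the textbook or any particular DBMS does it.

```python
class HashFile:
    """Toy hash file organization: the full records live in the buckets.

    The hash function alone decides where a record is stored, so there is
    no separate index structure sitting between the key and the record.
    """

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, pk):
        return self._buckets[hash(pk) % len(self._buckets)]

    def insert(self, pk, record):
        bucket = self._bucket(pk)
        if any(k == pk for k, _ in bucket):
            raise KeyError(f"duplicate primary key: {pk!r}")
        bucket.append((pk, record))

    def get(self, pk):
        """Equality lookup: hash the key, then search a single bucket."""
        for k, rec in self._bucket(pk):
            if k == pk:
                return rec
        return None

    def scan(self):
        """All records in bucket order, which is not primary-key order."""
        return [pair for bucket in self._buckets for pair in bucket]


hf = HashFile()
for pk, name in [(30, "Carol"), (10, "Alice"), (20, "Bob")]:
    hf.insert(pk, {"name": name})
```

Lookup by primary key is direct, which is the sense in which the book "pretends" the file has a clustering hash index. What is lost compared with a B-tree organization is ordered traversal and range retrieval on the primary key.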

Also note that there are still other ways to physically organize a database than just using B-tree or hash indexes.

So, to summarize:

  • "Clustered" usually refers to the index on the primary data records store. It is usually a B-tree [or some such] on the primary key, and the textbook presumably does not want you to know about more advanced possibilities.
  • "Secondary" usually refers to additional indexes that provide additional "fast access to specific tuples/rows". Such an index is usually also a B-tree that permits in-order traversal, just like the "clustered"/"primary" index, but it can also be a hash index that permits only "access by given value" and no in-order traversal.

Hope that helps.
