PostgreSQL index on a string column

Question

Say I have a table ResidentInfo, and in this table I have a unique constraint on HomeAddress, which is of type VARCHAR. For future queries, I am going to add an index on this column. The queries will only use the = operator, and I'll use a B-Tree index, since hash indexes are currently not recommended.

Question: from an efficiency point of view, using B-Tree, should I add a new column with numbers 1, 2, 3, ..., N corresponding to the different home addresses, and index that number column instead of indexing HomeAddress?

I ask this question because I don't know how indexes work.

Recommended answer

For simple equality checks (=), a B-Tree index on a varchar or text column is simple and the best choice. It certainly helps performance a lot.

Of course, a B-Tree index on a simple integer performs better. For starters, comparing simple integer values is a bit faster. But more importantly, performance is also a function of the size of the index. A bigger column means fewer rows per data page, which in turn means more pages have to be read to answer the same query.

Since HomeAddress is hardly unique anyway, it's not a good natural primary key. I would strongly suggest using a surrogate primary key instead. A serial column is the obvious choice for that. Its only purpose is to provide a simple, fast primary key to work with.

If you have other tables referencing this table, the surrogate key becomes even more efficient. Instead of duplicating a lengthy string in the foreign key column, you only need the 4 bytes of an integer column. And you rarely need to cascade updates, since an address is bound to change, while a surrogate pk can stay the same (but doesn't have to, of course).

Your table could look like this:

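CREATE TABLE resident (
   resident_id serial PRIMARY KEY,   -- surrogate key; the PK constraint creates a unique B-Tree index
   address     text NOT NULL
   -- more columns
);

-- Plain (non-unique) B-Tree index for equality lookups on address
CREATE INDEX resident_adr_idx ON resident (address);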

This results in two B-Tree indexes: a unique index on resident_id (created implicitly by the primary key) and a plain index on address.
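To make the foreign-key argument above concrete, here is a minimal sketch. The `vehicle` table, its columns, and the sample address values are hypothetical and not part of the original question; they only illustrate how a referencing table stores the 4-byte integer key instead of a copy of the address string.

```sql
-- Hypothetical referencing table, for illustration only.
CREATE TABLE vehicle (
   vehicle_id  serial PRIMARY KEY,
   -- 4-byte integer key instead of a duplicated address string
   resident_id integer NOT NULL REFERENCES resident (resident_id),
   plate       text NOT NULL
);

-- Postgres does not create an index on the referencing column by itself,
-- so add one if you join or filter on it.
CREATE INDEX vehicle_resident_idx ON vehicle (resident_id);

-- Equality lookup by address; the planner can use resident_adr_idx here.
SELECT resident_id
FROM   resident
WHERE  address = '12 Example Street';

-- An address change touches only the resident row;
-- rows in vehicle keep pointing at the same resident_id.
UPDATE resident
SET    address = '34 Sample Road'
WHERE  resident_id = 1;
```

With the address as the key, the same update would also have to rewrite every referencing row.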

More about indexes in the manual.
Postgres offers a lot of options, but you don't need any more for this simple case.
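If you want to check the size argument on your own data, a sketch along these lines may help. It assumes the `resident` table and `resident_adr_idx` index from above, and that the primary key index carries the default name `resident_pkey`; the address literal is a made-up sample value.

```sql
-- Show the plan the planner picks for an equality lookup on address.
EXPLAIN
SELECT *
FROM   resident
WHERE  address = '12 Example Street';

-- Compare the on-disk size of the integer index and the text index.
SELECT pg_size_pretty(pg_relation_size('resident_pkey'))    AS pk_index_size,
       pg_size_pretty(pg_relation_size('resident_adr_idx')) AS address_index_size;
```

On a very small table the planner may still prefer a sequential scan, so the plan is only meaningful once the table holds a realistic number of rows.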
