Calculate tuple size

Question

I am trying to understand how column order can minimize table size in PostgreSQL.

Example:

CREATE TABLE test (
 column_1 int
,column_2 int
,column_3 bigint
,column_4 bigint
,column_5 text
,column_6 text
,column_7 numeric(5,2)
,column_8 numeric(5,2)
,column_9 timestamp
,column_10 boolean
,column_11 boolean
);

INSERT INTO test
  VALUES(1,1,1,1,'test','test_1',12,12,current_timestamp,true,false);

SELECT pg_column_size(test.*) FROM test;

 pg_column_size 
----------------
       82
    (1 row)

Tuple size:

23 bytes of tuple header overhead + 1 byte for the NULL bitmap, so:

24+4+4+8+8+5+7+5+5+8+1+1=80, but the actual tuple size is 82.

Is there an extra overhead of 2 bytes?

I understood the example given in the link below:
Calculating and saving space in PostgreSQL

If we remove column_8 numeric(5,2), the tuple size still remains the same: 82.

I have reordered the columns to minimize the tuple size, which gives 80.

CREATE TABLE test (
 column_3 bigint
,column_4 bigint
,column_9 timestamp
,column_1 int
,column_2 int
,column_10 boolean
,column_11 boolean
,column_7 numeric(5,2)
,column_8 numeric(5,2)
,column_5 text
,column_6 text);

INSERT INTO test
  VALUES(1,1,current_timestamp,1,1,true,false,12,12,'test','test_1');

SELECT pg_column_size(test) FROM test;

 pg_column_size 
----------------
      80

Is there any recommendation for column order in PostgreSQL?

Recommended answer

You missed another 2 bytes of padding before column_9 timestamp, which needs to start at a multiple of 8 bytes.


24+4+4+8+8+5+7+5+5+8+1+1=80 but the actual tuple size is 82.  
------------------^ <----- 2 bytes of padding here


That is also the reason why:

If we remove column_8 numeric(5,2), the tuple size still remains the same: 82.

After taking out that column, which occupies 5 bytes, you get 7 bytes of padding at the same spot: the worst case.
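The padding arithmetic above can be replayed with a small model. This is only a sketch under stated assumptions, not PostgreSQL code: a 24-byte header (the 23-byte header padded to 8, with no NULL bitmap because the row contains no NULLs), 8-byte alignment for bigint and timestamp, and short text/numeric values stored with a 1-byte header and no alignment. The per-value sizes are the ones used in the question; `tuple_size` is a hypothetical helper name.

```python
# Simplified model of a heap tuple's data layout (assumptions, not server code).
HEADER = 24  # 23-byte tuple header padded to a multiple of 8; no NULLs in this row

def tuple_size(columns):
    """columns: list of (name, size_in_bytes, alignment).
    Returns (total_bytes, {column_name: padding_bytes_inserted_before_it})."""
    offset = HEADER
    padding = {}
    for name, size, align in columns:
        pad = -offset % align          # bytes needed to reach the next multiple of align
        if pad:
            padding[name] = pad
        offset += pad + size
    return offset, padding

original = [
    ("column_1", 4, 4), ("column_2", 4, 4),      # int
    ("column_3", 8, 8), ("column_4", 8, 8),      # bigint
    ("column_5", 5, 1), ("column_6", 7, 1),      # text 'test', 'test_1' (short varlena)
    ("column_7", 5, 1), ("column_8", 5, 1),      # numeric(5,2) holding 12
    ("column_9", 8, 8),                          # timestamp
    ("column_10", 1, 1), ("column_11", 1, 1),    # boolean
]
without_column_8 = [c for c in original if c[0] != "column_8"]

print(tuple_size(original))          # (82, {'column_9': 2})
print(tuple_size(without_column_8))  # (82, {'column_9': 7})
```

In both cases the data before column_9 ends short of offset 72, so the timestamp is pushed to the same place and the total stays at 82.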

Also note that this row actually occupies 88 bytes on disk, because the tuple header of the next tuple must be aligned (it starts at a multiple of MAXALIGN, typically 8 bytes).

Your modified row ends at a multiple of 8 bytes, so it incurs no extra padding and needs just the 80 bytes.

Both need another 4 bytes for the item identifier (the tuple pointer stored in the array right after the page header), though.
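As a rough check of those on-disk numbers, here is a minimal sketch. It assumes MAXALIGN is 8 (typical on 64-bit platforms) and a 4-byte item identifier per tuple; `on_disk_bytes` is a hypothetical helper, not a PostgreSQL function.

```python
MAXALIGN = 8       # assumption: typical 64-bit build
ITEM_ID_SIZE = 4   # one item identifier (tuple pointer) per tuple

def on_disk_bytes(tuple_size):
    """Return (tuple size rounded up to MAXALIGN, that size plus the item identifier)."""
    aligned = (tuple_size + MAXALIGN - 1) // MAXALIGN * MAXALIGN
    return aligned, aligned + ITEM_ID_SIZE

print(on_disk_bytes(82))  # (88, 92)  original column order
print(on_disk_bytes(80))  # (80, 84)  reordered columns
```

So under these assumptions the reordering saves 8 bytes per row on disk rather than the 2 bytes that pg_column_size suggests.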

It's a game of "column tetris", the basics of which you seem to have understood already. Typically you won't gain much, so don't think too hard about it. There are extreme corner cases, though. Null values change the game per row.
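One common heuristic for this game is to place columns with the strictest alignment first. The sketch below applies it to the question's columns under simplifying assumptions: a 24-byte header, no NULLs, and the same per-value sizes as in the question. `padded_size` is a hypothetical helper, and the result only holds for this particular row, since variable-length values and NULLs differ from row to row.

```python
def padded_size(columns, header=24):
    """columns: list of (name, size_in_bytes, alignment). Returns total tuple bytes."""
    offset = header
    for _name, size, align in columns:
        offset += -offset % align + size   # pad up to the alignment, then add the value
    return offset

columns = [
    ("column_1", 4, 4), ("column_2", 4, 4),
    ("column_3", 8, 8), ("column_4", 8, 8),
    ("column_5", 5, 1), ("column_6", 7, 1),
    ("column_7", 5, 1), ("column_8", 5, 1),
    ("column_9", 8, 8),
    ("column_10", 1, 1), ("column_11", 1, 1),
]

# Widest alignment first; sorted() is stable, so ties keep their original order.
reordered = sorted(columns, key=lambda c: -c[2])

print(padded_size(columns))    # 82
print(padded_size(reordered))  # 80
```

The sorted order puts the three 8-byte columns first, then the two ints, then everything with no alignment requirement, which is one of several orderings that reach 80.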

You need to know size, alignment and padding requirements for each data type and the special rules for the NULL bitmap.

  • Do nullable columns occupy additional space in PostgreSQL?
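To illustrate the NULL bitmap rule, here is a small sketch. It assumes a 23-byte fixed header, a bitmap of one bit per column that is present only when the row contains at least one NULL, and a header padded to MAXALIGN = 8; `header_size` is a hypothetical helper name.

```python
MAXALIGN = 8        # assumption: typical 64-bit build
FIXED_HEADER = 23   # fixed part of the heap tuple header

def header_size(n_columns, has_nulls):
    """Tuple header size in bytes, including the optional NULL bitmap and padding."""
    size = FIXED_HEADER
    if has_nulls:
        size += (n_columns + 7) // 8   # one bit per column, rounded up to whole bytes
    return (size + MAXALIGN - 1) // MAXALIGN * MAXALIGN

print(header_size(11, False))  # 24: no NULLs, the 24th byte is just padding
print(header_size(8, True))    # 24: a bitmap for up to 8 columns fits in that spare byte
print(header_size(11, True))   # 32: a 2-byte bitmap pushes the header to the next multiple of 8
```

For the 11-column table in the question, a single NULL in a row would therefore grow the header by 8 bytes in this model, while the NULL column itself stores no data.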

Related answer with detailed calculation on dba.SE:

  • Configuring PostgreSQL for read performance
