Understanding MyISAM record structure


Question

      I am trying to understand how MyISAM physically stores its records and how it maintains its structure after record insertion and record deletion. I have read the following link:

      I want to make sure I understand it correctly; please correct me if anything is wrong.

      Fixed-sized record

      • A delete marker indicates whether the record is deleted.
      • The record header records which columns of the row contain NULL values.
      • The length of the data is fixed.

      Variable-sized record

      • The delete marker is replaced with the BLOCK_DELETED block type.
      • The record header holds the length of the data and the length of the unused space.
      • A single record can be separated into multiple blocks connected by overflow pointers.

      Deletion

      • For a variable-sized record, change the block type to BLOCK_DELETED.
      • Maintain a doubly linked list of all deleted records: the newly deleted record's previous pointer points to the last deleted record, and that record's next pointer points to the newly deleted record.
      • For a fixed-sized record, simply set the delete marker. (I am unsure whether a doubly linked list is also used to connect deleted fixed-sized records.)

      Insertion

      • If there is no unused space (no deleted records), append the data at the end of the file.
      • If there is unused space that fits the newly inserted record, write the new record there.
      • If there is unused space that is far bigger than the newly inserted record, split it into two records: the new record and a deleted record.
      • If there is unused space that is smaller than the newly inserted record, write as much data as fits there and use an overflow pointer to point to the remaining data in another block.
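The four insertion cases above can be sketched as a small toy model. This is only an illustration of the rules as described in the question, not MyISAM's actual code: the class name `ToyDataFile`, the `min_split` threshold, and the use of list indexes instead of file offsets are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    size: int                        # bytes reserved for this block
    used: int = 0                    # bytes of row data stored here
    deleted: bool = False
    next_part: Optional[int] = None  # index of the continuation block, if any


class ToyDataFile:
    """Toy model of a dynamic-row data file (indexes are not real offsets)."""

    def __init__(self, min_split: int = 4):
        self.blocks: List[Block] = []
        self.free: List[int] = []    # indexes of deleted blocks
        self.min_split = min_split   # hypothetical threshold for splitting

    def delete(self, idx: Optional[int]) -> None:
        # Mark every block of the row's chain as deleted and reusable.
        while idx is not None:
            block = self.blocks[idx]
            nxt = block.next_part
            block.deleted, block.used, block.next_part = True, 0, None
            self.free.append(idx)
            idx = nxt

    def insert(self, length: int) -> int:
        if length <= 0:
            raise ValueError("length must be positive")
        if not self.free:
            # Case 1: no deleted space, append at the end of the file.
            self.blocks.append(Block(size=length, used=length))
            return len(self.blocks) - 1
        idx = self.free.pop()
        block = self.blocks[idx]
        block.deleted = False
        if length <= block.size:
            spare = block.size - length
            if spare >= self.min_split:
                # Case 3: hole is much bigger, split off a new deleted block.
                block.size = length
                self.blocks.append(Block(size=spare, deleted=True))
                self.free.append(len(self.blocks) - 1)
            # Case 2 (or the first half of case 3): the row fits here.
            block.used = length
        else:
            # Case 4: hole is too small, store what fits and chain the rest.
            block.used = block.size
            block.next_part = self.insert(length - block.size)
        return idx

    def parts(self, idx: Optional[int]) -> int:
        # Number of blocks that must be read to reassemble one row.
        count = 0
        while idx is not None:
            count += 1
            idx = self.blocks[idx].next_part
        return count


f = ToyDataFile()
a = f.insert(10)
b = f.insert(20)
f.delete(a)
c = f.insert(25)    # reuses the 10-byte hole and overflows 15 bytes
print(f.parts(c))   # 2
```

In this model, every row whose `parts()` value is greater than 1 costs extra reads, which is the kind of fragmentation the additional question below asks about.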

      Updating

      • What if a user updates existing data with longer data? Will MyISAM mark the record as deleted and find a place that fits the new data, or simply use an overflow pointer to point to the data that does not fit?

      Recap the question again

      I want to make sure I understand it correctly; please correct me if anything is wrong.

      Additional questions

      • Would a table become very inefficient after many deletes and inserts, since the record structure could potentially be full of overflow pointers and unused space?

      Answer

      The information you have in the question concerning MyISAM is right on target. However, I would like to address your two additional questions:

      LATEST QUESTION

      What if a user updates existing data with longer data? Will MyISAM mark the record as deleted and find a place that fits the new data, or simply use an overflow pointer to point to the data that does not fit?

      According to the Book

      Chapter 10, "Storage Engines", page 196, paragraph 7 says:

      For records with variable length, the format is more complicated. The first byte contains a special code describing the subtype of the record. The meaning of the subsequent bytes varies with each subtype, but the common theme is that there is a sequence of bytes that contains the length of the record, the number of unused bytes in the block, NULL value indicator flags, and possibly a pointer to the continuation of the record if the record did not fit into the previously created space and had to be split up. This can happen when one record gets deleted, and a new one to be inserted into its place exceeds the original one in size. You can get the details of the meanings of different codes by studying the switch statement in _mi_get_block_info() in storage/myisam/mi_dynrec.c.

      Based on that paragraph, the old record gets overwritten with linkage data only if the new data to insert cannot fit in the previously allocated block. This can result in many bloated rows.

      ADDITIONAL QUESTION

      Would a table become very inefficient after many deletes and inserts, since the record structure could potentially be full of overflow pointers and unused space?

      From my previous answer, there would be lots of blocks that have

      • block of space
      • the length of the record
      • the number of unused bytes in the block
      • NULL value indicator flags
      • possibly a pointer to the continuation of the record if the record did not fit into the previously created space and had to be split up

      Such record links would start at the front of every row that has had oversized data inserted. This can bloat a MyISAM table's .MYD file very quickly.

      SUGGESTIONS

      MyISAM uses the Dynamic row format by default for any table that has variable-length columns (VARCHAR, VARBINARY, BLOB, or TEXT). When a table is Dynamic and experiences lots of INSERTs, UPDATEs, and DELETEs, it needs to be optimized periodically with

      OPTIMIZE TABLE mytable;
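
One way to decide when that is worthwhile is to compare the `Data_free` and `Data_length` values that `SHOW TABLE STATUS` reports for the table. The helper below is a minimal sketch of that comparison; the function names and the 20% threshold are assumptions chosen for illustration, not a MySQL recommendation.

```python
def free_fraction(data_length: int, data_free: int) -> float:
    """Share of the data file that is allocated but unused."""
    if data_length <= 0:
        return 0.0
    return data_free / data_length


def needs_optimize(data_length: int, data_free: int,
                   threshold: float = 0.2) -> bool:
    # threshold is a hypothetical cutoff; tune it for your workload.
    return free_fraction(data_length, data_free) >= threshold


# Example with made-up numbers: 250 of 1000 bytes are unused.
print(needs_optimize(1000, 250))   # True
```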
      

      There is an alternative: switch the table's row format to Fixed. That way, all rows are the same size. This is how you make the row format Fixed:

      ALTER TABLE mytable ROW_FORMAT=Fixed;
      

      Even with the Fixed row format, MyISAM must still locate an available record slot, but that lookup is O(1): it takes the same amount of time no matter how many rows the table has or how many of them are deleted. You could bypass that step by enabling concurrent_insert as follows:

      Add this to my.cnf

      [mysqld]
      concurrent_insert = 2
      

      A MySQL restart is not required. Just run

      mysql> SET GLOBAL concurrent_insert = 2;
      

      This would cause all INSERTs to go to the back of the table without looking for free space.
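
To make the O(1) lookup and the append-only alternative concrete, here is a toy model of a fixed-length data file. It assumes deleted slots are chained through a single "next free" pointer with the head kept separately; that layout, the class name `ToyFixedFile`, and the `append_only` flag (standing in for the effect of concurrent_insert = 2 described above) are illustrative assumptions rather than a description of MyISAM's source.

```python
class ToyFixedFile:
    """Toy model of a fixed-row data file with a chain of deleted slots."""

    def __init__(self, reclength: int):
        self.reclength = reclength
        self.slots = []          # row bytes, or ("deleted", next_free_slot)
        self.free_head = None    # most recently deleted slot, if any

    def offset(self, n: int) -> int:
        # Fixed rows make the byte offset of row n a simple multiplication.
        return n * self.reclength

    def delete(self, n: int) -> None:
        # Push the slot onto the front of the free chain.
        self.slots[n] = ("deleted", self.free_head)
        self.free_head = n

    def insert(self, row: bytes, append_only: bool = False) -> int:
        if len(row) > self.reclength:
            raise ValueError("row longer than the fixed record length")
        row = row.ljust(self.reclength, b"\x00")   # pad to the fixed width
        if self.free_head is not None and not append_only:
            # Pop the head of the free chain: constant time, no scanning.
            n = self.free_head
            self.free_head = self.slots[n][1]
            self.slots[n] = row
        else:
            # No hole available (or holes are ignored): append at the end.
            self.slots.append(row)
            n = len(self.slots) - 1
        return n


t = ToyFixedFile(reclength=8)
for value in (b"a", b"b", b"c"):
    t.insert(value)
t.delete(0)
t.delete(2)
print(t.insert(b"x"))                     # 2, the most recently freed slot
print(t.insert(b"y", append_only=True))   # 3, appended despite the hole at 0
```

The trade-off shows up in the last line: appending skips the free-chain lookup entirely but leaves slot 0 unused until something reclaims it.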

      Advantages of Fixed Row tables

      • INSERTs, UPDATEs, and DELETEs would be somewhat faster
      • SELECTs are 20-25% faster

      Here are some of my posts on SELECT being faster for Row Formats being Fixed

      Disadvantage of Fixed Row tables

      In most cases, when you run ALTER TABLE mytable ROW_FORMAT=Fixed;, the table may grow 80-100%. The .MYI file (index pages for the MyISAM table) would also grow at the same rate.
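
The growth comes from padding: in the Fixed format every CHAR and VARCHAR column occupies its full declared width in every row. The arithmetic below is a rough sketch of that effect, assuming a single-byte character set, a one-byte length prefix per dynamic column, and ignoring per-row headers; the column widths and average lengths are made-up numbers, not measurements.

```python
def fixed_row_bytes(declared_widths):
    # Every column is stored at its full declared width.
    return sum(declared_widths)


def dynamic_row_bytes(avg_lengths, length_prefix=1):
    # Each column stores only its actual data plus a small length prefix.
    return sum(n + length_prefix for n in avg_lengths)


def growth_percent(declared_widths, avg_lengths):
    fixed = fixed_row_bytes(declared_widths)
    dynamic = dynamic_row_bytes(avg_lengths)
    return 100.0 * (fixed - dynamic) / dynamic


# Hypothetical table: VARCHAR(50), VARCHAR(100), VARCHAR(20)
declared = [50, 100, 20]
average = [30, 45, 12]     # assumed average stored lengths
print(round(growth_percent(declared, average), 1))   # 88.9
```

The further the average stored length falls below the declared width, the larger the growth, so columns declared much wider than their typical contents are the ones that inflate a Fixed table the most.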

      EPILOGUE

      If you want speed for MyISAM tables and can live with bigger tables, use the Fixed row format and concurrent_insert as suggested above. If you want to conserve space for each MyISAM table, leave the row format as is (Dynamic); you will then have to defragment the table with OPTIMIZE TABLE mytable; more frequently.
