对Linux文件系统的历史透视 [英] Historical perspective to Linux Filesystems

查看:302
本文介绍了对Linux文件系统的历史透视的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

Jonathan Leffler在如何可以我发现一些指定文件的大小?是发人深省的。
$ b



  1. - 文件存储在页面上;

  2. 通常情况下,使用
    的空间比b $ b要多,因为一个1字节的文件(经常)占用
    一页(可能是512字节)。
    精确值不一样 - 在第七版Unix文件
    系统的日子里
    更容易(尽管不是微不足道的那么



4-5。如果你想把
inode引用的
间接块考虑为以及原始数据块)。

有关零件的问题


  1. page的定义是什么?

  2. 为什么后面想到的一页(可能是512字节)中的可能这个词?

  3. 为什么在第七版Unix文件系统中更容易测量确切的大小?

  4. 间接封锁的定义是什么?






  5. $ b $ $ b

    出现的历史问题

    I。勒弗勒谈到的历史背景是什么?

    二。有没有改变
    的定义?

    /en.wikipedia.org/wiki/Block_%28data_storage%29rel =nofollow noreferrer> Block(数据存储)尽管在关联所有关键字方面过于繁琐,但仍是信息丰富的。


    计算(特别是数据传输和数据存储),一个块是一个 bytes 比特,具有标称长度(块大小)。据说这样构造的数据被阻塞。将数据放入块的过程称为阻塞。阻塞被用来促进由接收数据的计算机程序处理数据流。被阻塞的数据一次只能读取整个数据块。在将数据存储到9轨磁带时,几乎普遍采用了 软盘硬盘光盘和NAND 闪存



    大多数文件系统基于 block device ,这是抽象。在传统的文件系统中,一个块可能只包含一个文件的一部分。由于内部碎片,这导致空间效率低下,因为文件长度通常不是块大小的倍数,因此最后一个文件块将保持部分空白。这将创建松散空间,平均每个文件一半的块。一些较新的文件系统试图通过称为块子分配 tail merge

    还有一个经典的 Unix文件系统的合理概述。



    传统上,硬盘几何形状(磁盘本身的块布局)一直是 CHS


    • :每个磁性读写器a)拼盘;可以移入和移出以访问不同的圆柱体

    • 圆柱:在圆盘旋转时通过磁头的磁道
    • 扇区:连续存储在圆柱体的一部分上的恒定大小的数据量;驱动器可以处理的最小数据单元



    现在,CHS的使用并不多,因为


    • 硬盘不再使用每个柱面的恒定数量的扇区。更多的数据通过使用每个扇区的恒定弧长而不是恒定的旋转角度挤压在盘片上,所以外部气缸上的扇区多于内部气缸上的扇区。
    • 在ATA规范中,驱动器每个气缸可以具有不多于每个气缸2 16个气缸,2个4个气缸以及2 8个扇区;与512B扇区,这是一个128GB的限制。通过BIOS INT13,通过CHS无法访问超过7.88GB的任何内容。
    • 为了向后兼容,较大的驱动器仍然声称具有CHS几何(否则DOS不会能够启动),但是获得更高的数据需要使用LBA寻址。
    • CHS甚至在RAID或非循环介质上都没有意义。


    • $ p
      $ b

      但由于历史原因,这会影响块大小:因为扇区大小几乎总是512B,文件系统块大小一直是512B的倍数。 (有一个运动正在进行介绍1kB和4kB扇区大小的驱动器,但兼容性看起来相当痛苦。)一般来说,较小的文件系统块大小在存储许多小文件时会导致浪费更少的空间(除非正在使用诸如尾部合并之类的高级技术)而较大的块大小减少了外部碎片,并且在较大的磁盘上开销较低。文件系统的块大小通常是2的幂,被块设备的扇区大小限制在下面,而且通常受操作系统页面大小限制。



      页面大小因操作系统和平台而异(在Linux的情况下,可能会有所不同通过配置以及)。像块大小一样,较小的块大小减少了内部碎片,但是需要更多的管理开销。在32位平台上的4kB页面大小是常见的。

      现在来描述间接块。在UFS设计中,


      • inode 描述了一个文件。
      • 在UFS设计中,inode可容纳的数据块的指针数量非常有限(小于16)。
      • 对于小文件,指针可以直接指向组成文件的数据块。

      • 对于较大的文件,必须有间接指针,指向只包含更多指向块的指针的块。这些可能是指向属于该文件的数据块的直接指针,或者如果该文件非常大,则可能是更间接的指针。


        <因此,当使用间接指针时,文件所需的存储量可能会大于包含其数据的块。



        并非所有文件系统都使用此方法用于跟踪属于文件的数据块。 FAT 只是简单地使用一个单一的文件分配表,这个表格实际上是一个巨大的一系列链表,现代文件系统使用范围


        Jonathan Leffler's comment in the question "How can I find the Size of some specified files?" is thought-provoking. I will break it into parts for analysis.

        1. -- files are stored on pages;

        2. you normally end up with more space being used than that calculation gives because a 1 byte file (often) occupies one page (of maybe 512 bytes).

        3. The exact values vary - it was easier in the days of the 7th Edition Unix file system (though not trivial even then

        4-5. if you wanted to take account of indirect blocks referenced by the inode as well as the raw data blocks).

        Questions about the parts

        1. What is the definition of "page"?
        2. Why is the word "maybe" in the after-thought "one page (of maybe 512 bytes)"?
        3. Why was it easier to measure exact sizes in the "7th Edition Unix file system"?
        4. What is the definition of "indirect block"?
        5. How can you have references by two things: "the inode" and "the raw data blocks"?

        Historical Questions Emerged

        I. What is the historical context Leffler is speaking about?

        II. Have the definitions changed over time?

        解决方案

        As usual for Wikipedia pages, Block (data storage) is informative despite being far too exuberant about linking all keywords.

        In computing (specifically data transmission and data storage), a block is a sequence of bytes or bits, having a nominal length (a block size). Data thus structured is said to be blocked. The process of putting data into blocks is called blocking. Blocking is used to facilitate the handling of the data-stream by the computer program receiving the data. Blocked data is normally read a whole block at a time. Blocking is almost universally employed when storing data to 9-track magnetic tape, to rotating media such as floppy disks, hard disks, optical discs and to NAND flash memory.

        Most file systems are based on a block device, which is a level of abstraction for the hardware responsible for storing and retrieving specified blocks of data, though the block size in file systems may be a multiple of the physical block size. In classical file systems, a single block may only contain a part of a single file. This leads to space inefficiency due to internal fragmentation, since file lengths are often not multiples of block size, and thus the last block of files will remain partially empty. This will create slack space, which averages half a block per file. Some newer file systems attempt to solve this through techniques called block suballocation and tail merging.

        There's also a reasonable overview of the classical Unix File System.

        Traditionally, hard disk geometry (the layout of blocks on the disk itself) has been CHS.

        • Head: the magnetic reader/writer on each (side of a) platter; can move in and out to access different cylinders
        • Cylinder: a track that passes under a head as the platter rotates
        • Sector: a constant-sized amount of data stored contiguously on a portion the cylinder; the smallest unit of data that the drive can deal with

        CHS isn't used much these days, as

        • Hard disks no longer use a constant number of sectors per cylinder. More data is squeezed onto a platter by using a constant arclength per sector rather than a constant rotational angle, so there are more sectors on the outer cylinders than there are on the inner cylinders.
        • By the ATA specification, a drive may have no more than 216 cylinders per head, 24 heads, and 28 sectors per cylinder; with 512B sectors, this is a limit of 128GB. Through BIOS INT13, it is not possible to access anything beyond 7.88GB through CHS anyways.
        • For backwards-compatibility, larger drives still claim to have a CHS geometry (otherwise DOS wouldn't be able to boot), but getting to any of the higher data requires using LBA addressing.
        • CHS doesn't even make sense on RAID or non-rotational media.

        but for historical reasons, this has affected block sizes: because sector sizes were almost always 512B, filesystem block sizes have always been multiples of 512B. (There is a movement afoot to introduce drives with 1kB and 4kB sector sizes, but compatibility looks rather painful.)

        Generally speaking, smaller filesystem block sizes result in less wasted space when storing many small files (unless advanced techniques like tail merging are in use), while larger block sizes reduce external fragmentation and have lower overhead on large disks. The filesystem block size is usually a power of 2, is limited below by the block device's sector size, and is often limited above by the OS's page size.

        The page size varies by OS and platform (and, in the case of Linux, can vary by configuration as well). Like block size, smaller block sizes reduce internal fragmentation but require more administrative overhead. 4kB page sizes on 32-bit platforms is common.

        Now, on to describe indirect blocks. In the UFS design,

        • An inode describes a file.
        • In the UFS design, the number of pointers to data blocks that an inode could hold is very limited (less than 16). The specific number appears to vary in derived implementations.
        • For small files, the pointers can directly point to the data blocks that compose a file.
        • For larger files, there must be indirect pointers, which point to a block which only contains more pointers to blocks. These may be direct pointers to data blocks belonging to the file, or if the file is very large, they may be even more indirect pointers.

        Thus the amount of storage required for a file may be greater than just the blocks containing its data, when indirect pointers are in use.

        Not all filesystems use this method for keeping track of the data blocks belong to a file. FAT simply uses a single file allocation table which is effectively a gigantic series of linked lists, and many modern filesystems use extents.

        这篇关于对Linux文件系统的历史透视的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆