How does the stat command calculate the blocks of a file?


Question

I am wondering how the stat command calculates the blocks of a file. I read an article that says:

The value st_blocks gives the size of the file in 512-byte blocks. (This may be smaller than st_size/512 e.g. when the file has holes.) The value st_blksize gives the "preferred" blocksize for efficient file system I/O. (Writing to a file in smaller chunks may cause an inefficient read-modify-rewrite.)

But I cannot verify this in my tests.

My file system is ext3.

dumpe2fs -h /dev/sda3 shows:

...
First block: 0
Block size: 4096
Fragment size: 4096
...

Then I run:

kent@KentT60:~/Desktop$ stat Email
File: `Email'
Size: 965 Blocks: 8 IO Block: 4096 regular file
Device: 80ah/2058d Inode: 746095 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ kent) Gid: ( 1000/ kent)
Access: 2009-08-11 21:36:36.000000000 +0200
Modify: 2009-08-11 21:36:35.000000000 +0200
Change: 2009-08-11 21:36:35.000000000 +0200

If Blocks here means the number of 512-byte blocks, the number should be 2, not 8. My guess was that the block size of the filesystem (IO Block) is 4K, so when the filesystem fetches the file Email it reads at least 4K from disk (8 x 512-byte blocks), which would mean 965/512 + 6 = 8. I am not sure whether this guess is correct.

Another test:

kent@KentT60:~/Desktop$ stat wxPython-demo-2.8.10.1.tar.bz2
File: `wxPython-demo-2.8.10.1.tar.bz2'
Size: 3605257 Blocks: 7056 IO Block: 4096 regular file
Device: 80ah/2058d Inode: 746210 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ kent) Gid: ( 1000/ kent)
Access: 2009-08-12 21:45:45.000000000 +0200
Modify: 2009-08-12 21:43:46.000000000 +0200
Change: 2009-08-12 21:43:46.000000000 +0200


3605257/512 = 7041.xx, which rounds up to 7042.

Following my guess above, this would be 7042 + 6 = 7048, but the stat result shows 7056.

Here is another example, from http://www.computerhope.com/unix/stat.htm. I copy the example from the bottom of that page here:

File: `index.htm'
Size: 17137 Blocks: 40 IO Block: 8192 regular file
Device: 8h/8d Inode: 23161443 Links: 1
Access: (0644/-rw-r--r--) Uid: (17433/comphope) Gid: ( 32/ www)
Access: 2007-04-03 09:20:18.000000000 -0600
Modify: 2007-04-01 23:13:05.000000000 -0600
Change: 2007-04-02 16:36:21.000000000 -0600

In this example, the FS block size is 8K. I supposed the Blocks number should be 16 x N, but it is 40. I am lost...

Can anyone explain how stat calculates Blocks?

Thanks!

Solution

The stat command-line tool uses the stat / fstat etc. functions, which return data in the stat structure. The st_blocks member of the stat structure returns:

The total number of physical blocks of size 512 bytes actually allocated on disk. This field is not defined for block special or character special files.

So for your "Email" example, with a size of 965 and a block count of 8, it is indicating that 8*512=4096 bytes are physically allocated on disk. The reason it's not 2 is that the file system on disk does not allocate space in units of 512 bytes; it evidently allocates space in units of 4096 bytes. (And the unit of allocation may vary depending on file size and filesystem sophistication. E.g. ZFS supports different units of allocation.)

Similarly, for the wxPython example, it indicates that 7056*512 bytes, or 3612672 bytes, are physically allocated on disk. You get the idea.
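To make the arithmetic concrete, here is a rough Python sketch that rounds a file size up to whole allocation units and converts the result to 512-byte blocks. The helper name blocks_for and its parameters are made up for illustration. Two things in it are assumptions rather than facts from the outputs above: that index.htm sits on a file system allocating in 4096-byte units (its IO Block of 8192 is only an I/O hint), and that the extra 4096 bytes on the wxPython file are one ext3 indirect block charged to the file.

```python
SECTOR = 512  # unit in which st_blocks is reported


def blocks_for(size, alloc_unit=4096, extra_units=0):
    """Estimate st_blocks for a file of `size` bytes.

    Assumes the file system allocates whole units of `alloc_unit` bytes.
    `extra_units` is a hypothetical knob for metadata (for example an
    indirect block) that the file system charges to the file.
    """
    data_units = -(-size // alloc_unit)  # ceiling division
    return (data_units + extra_units) * alloc_unit // SECTOR


print(blocks_for(965))                      # Email
print(blocks_for(17137))                    # index.htm, assuming 4096-byte units
print(blocks_for(3605257))                  # wxPython, data blocks only
print(blocks_for(3605257, extra_units=1))   # wxPython plus one assumed metadata block
```

With these assumptions the sketch gives 8, 40, 7048 and 7056, which matches the three stat outputs in the question. It is only a model of one simple allocation policy, not how any particular file system computes the value.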

The IO block size is "a hint as to the 'best' unit size for I/O operations" - it's usually the unit of allocation on the physical disk. Don't get confused between the IO block and the block that stat uses to indicate physical size; the blocks for physical size are always 512 bytes.
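The same fields can be read programmatically. A minimal Python sketch, assuming a Unix-like system where os.stat exposes st_blocks and st_blksize (the helper name physical_usage is made up for illustration):

```python
import os
import tempfile


def physical_usage(path):
    """Return the logical size, the 512-byte block count and the I/O hint."""
    st = os.stat(path)
    return {
        "size": st.st_size,                       # logical length in bytes
        "blocks": st.st_blocks,                   # counted in 512-byte units
        "allocated_bytes": st.st_blocks * 512,    # physical usage on disk
        "io_block": st.st_blksize,                # preferred I/O size, a hint only
    }


# Create a 965-byte file, like the Email example, and inspect it.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 965)
    f.flush()
    os.fsync(f.fileno())  # ask the OS to write the data out before we stat it
    demo_path = f.name

print(physical_usage(demo_path))
os.remove(demo_path)
```

The numbers printed depend on the file system underneath the temporary directory, so no particular output should be expected.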

Update based on comment:

Like I said, st_blocks is how the OS indicates how much space is used by the file on disk. The actual units of allocation on disk are the choice of the file system. For example, ZFS can have allocation blocks of variable size, even in the same file, because of the way it allocates blocks: files start out having a small block size, and the block size keeps increasing until it reaches a particular point. If the file is later truncated, it will probably keep the old block size. So based on the history of the file, it can have multiple possible block sizes. So given a file size, it is not always obvious why it has a particular physical size.

Concrete example: on my Solaris box, with a ZFS file system, I can create a very short file:

$ echo foo > test
$ stat test
  Size: 4               Blocks: 2          IO Block: 512    regular file
(irrelevant details omitted)

OK, small file, 2 blocks, physical disk usage is 1024 for this file.

$ dd if=/dev/zero of=test2 bs=8192 count=4
$ stat test2
  Size: 32768           Blocks: 65         IO Block: 32768  regular file

OK, now we see physical disk usage of 32.5K, and an IO block size of 32K. I then copied it to test3 and truncated this test3 file in an editor:

$ cp test2 test3
$ joe -hex test3
$ stat test3
  Size: 4               Blocks: 65         IO Block: 32768  regular file

Well now, here's a file with 4 bytes in it - just like test - but it's using 32.5K physically on the disk, because of the way the ZFS file system allocates space. Block sizes increase as the file gets larger, but they don't decrease when the file gets smaller. (And yes, this can lead to substantial wasted space depending on the kinds of files and file operations you do on ZFS, which is why it allows you to set the maximum block size on a per-filesystem basis, and change it dynamically.)

Hopefully you should now appreciate that there isn't necessarily a simple relationship between file size and physical disk usage. Even in the above it's not clear why 32.5K bytes are needed to store a file that's exactly 32K in size - it appears that ZFS generally needs an extra 512 bytes for extra storage of its own. Perhaps it's using that storage for checksums, reference counts, transaction state - file system bookkeeping. By including these extras in the indicated physical file size, it seems like ZFS is trying not to mislead the user as to the physical costs of the file. That doesn't mean it's trivial to reverse-engineer the calculation without knowing intimate details about the underlying file system implementation.
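The mismatch can also go the other way, as the text quoted in the question notes for files with holes. A small Python sketch, assuming a Unix-like system and a file system that supports sparse files (the helper name make_sparse is made up for illustration):

```python
import os
import tempfile


def make_sparse(path, size):
    """Create a file of `size` bytes without writing any data to it."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; supporting file systems leave a hole
    return os.stat(path)


sparse_path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
st = make_sparse(sparse_path, 1 << 20)  # 1 MiB logical size

print("size:", st.st_size)
print("allocated bytes:", st.st_blocks * 512)
os.remove(sparse_path)
```

On a file system with sparse-file support the allocated byte count is typically far below the logical size; elsewhere the two may be close. Either way, st_blocks reports what the file system charges to the file, not a function of st_size.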
