How does "(head; tail) < file" work?

Problem description

(via http://stackoverflow.com/a/8624829/23582)

How does (head; tail) < file work? Note that cat file | (head;tail) doesn't.

Also, why does (head; wc -l) < file give 0 as the output of wc?

Note: I understand how head and tail work. Just not the subtleties involved with these particular invocations.

Recommended answer

For OS X, you can look at the source code for head and the source code for tail to figure out some of what's going on. In the case of tail, you'll want to look at forward.c.

So, it turns out that head doesn't do anything special. It just reads its input using the stdio library, so it reads a buffer at a time and might read too much. This means cat file | (head; tail) won't work for small files where head's buffering makes it read some (or all) of the last 10 lines.
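The over-read can be reproduced without head itself. The following is a minimal sketch in Python, not the actual C source: first_lines_buffered and demo_pipe are hypothetical names, and the 8192-byte buffer is only assumed to mirror the stdio behavior described above. It pulls 10 lines through a buffered reader attached to a pipe, then checks what remains on the descriptor for the next command.

```python
import os

def first_lines_buffered(fd, n=10, bufsize=8192):
    """Read n lines through a buffered reader, roughly as a stdio-based head would."""
    # closefd=False leaves the descriptor open for whoever reads next.
    reader = open(fd, "rb", buffering=bufsize, closefd=False)
    return b"".join(reader.readline() for _ in range(n))

def demo_pipe():
    """Feed 12 short lines through a pipe and report what the next reader would see."""
    data = b"".join(b"%d\n" % i for i in range(1, 13))
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)
    first = first_lines_buffered(r)
    # Lines 11 and 12 were pulled into the reader's private buffer by the
    # first read, so nothing is left on the pipe itself.
    leftover = os.read(r, 8192)
    os.close(r)
    return first, leftover

if __name__ == "__main__":
    first, leftover = demo_pipe()
    print(first.decode(), end="")
    print("left for the next reader:", leftover)
```

With an input this small, the whole file fits in one buffer, so a following tail on the same pipe would find nothing to read.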

On the other hand, tail checks the type of its input file. If it's a regular file, tail seeks to the end and reads backwards until it finds enough lines to emit. This is why (head; tail) < file works on any regular file, regardless of size.
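The strategy described for tail can be sketched as follows. This is an illustration in Python under stated assumptions, not a port of forward.c: tail_from_fd is a hypothetical name and the 4096-byte block size is arbitrary. The point is that, for a regular file, the result does not depend on where an earlier reader left the file offset.

```python
import os
import stat

def tail_from_fd(fd, n=10, block=4096):
    """Return the last n lines available on fd, as bytes."""
    if not stat.S_ISREG(os.fstat(fd).st_mode):
        # Pipes and terminals cannot seek: read forward from wherever we are.
        chunks = []
        while True:
            chunk = os.read(fd, block)
            if not chunk:
                break
            chunks.append(chunk)
        data = b"".join(chunks)
        return b"".join(data.splitlines(keepends=True)[-n:])

    # Regular file: ignore the current offset, jump to the end, and read
    # backwards one block at a time until more than n newlines were seen
    # (or the start of the file is reached).
    pos = os.lseek(fd, 0, os.SEEK_END)
    data = b""
    while pos > 0 and data.count(b"\n") <= n:
        step = min(block, pos)
        pos -= step
        os.lseek(fd, pos, os.SEEK_SET)
        data = os.read(fd, step) + data
    return b"".join(data.splitlines(keepends=True)[-n:])
```

Called on a descriptor that a previous reader has already drained to end of file, this sketch still returns the last 10 lines, because the regular-file branch never consults the inherited offset.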

You could look at the source for head and tail on Linux too, but it's easier to just use strace, like this:

(strace -o /tmp/head.trace head; strace -o /tmp/tail.trace tail) < file

Take a look at /tmp/head.trace. You'll see that the head command tries to fill a buffer (of 8192 bytes in my test) by reading from standard input (file descriptor 0). Depending on the size of file, it may or may not fill the buffer. Anyway, let's assume that it reads 10 lines in that first read. Then, it uses lseek to back up the file descriptor to the end of the 10th line, essentially "unreading" any extra bytes it read. This works because the file descriptor is open on a normal, seekable file. So (head; tail) < file will work for any seekable file, but it won't make cat file | (head; tail) work.
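The read-then-lseek pattern seen in the trace can be sketched like this. It is a rough Python illustration, not the coreutils implementation: head_from_fd is a hypothetical name, and the 8192-byte buffer simply matches the size observed in the trace above.

```python
import os

def head_from_fd(fd, n=10, bufsize=8192):
    """Return the first n lines from fd, rewinding past any surplus if possible."""
    out = b""
    needed = n
    while needed > 0:
        buf = os.read(fd, bufsize)
        if not buf:
            break  # end of input before n lines were found
        end = 0
        while needed > 0:
            nl = buf.find(b"\n", end)
            if nl == -1:
                end = len(buf)  # partial line: keep it and read more
                break
            end = nl + 1
            needed -= 1
        out += buf[:end]
        extra = len(buf) - end
        if needed == 0 and extra > 0:
            try:
                # "Unread" the surplus so the next reader starts at line n+1.
                os.lseek(fd, -extra, os.SEEK_CUR)
            except OSError:
                # Not seekable (for example a pipe): the surplus is lost.
                pass
    return out
```

On a regular file, a read issued on the same descriptor after head_from_fd returns starts at line 11. On a pipe the lseek call fails, which is the case that cat file | (head; tail) runs into.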

On the other hand, tail does not (in my testing) seek to the end and read backwards, like it does on OS X. At least, it doesn't read all the way back to the beginning of the file.

Here's my test. Create a small, 12-line input file:

yes | head -12 | cat -n > /tmp/file

Then, try (head; tail) < /tmp/file on Linux. I get this with GNU coreutils 5.97:

     1  y
     2  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
    11  y
    12  y

But on OS X, I get this:

     1  y
     2  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
    11  y
    12  y
