How to skip a directory in awk?


Question

Say I have the following structure of files and directories:

$ tree
.
├── a
├── b
└── dir
    └── c

1 directory, 3 files

That is, two files a and b, plus a directory dir that contains another file c.

I want to process all the files with awk (GNU Awk 4.1.1, exactly), so I do something like this:

$ gawk '{print FILENAME; nextfile}' * */*
a
b
awk: cmd. line:1: warning: command line argument `dir' is a directory: skipped
dir/c

All is fine but the * also expands to the directory dir and awk tries to process it.

So I wonder: is there a native way for awk to check whether a given argument is a regular file and, if it is not, skip it? That is, without calling system() for it.

I made it work by calling the external system in BEGINFILE:

$ gawk 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, "is a dir, skipping"; nextfile}} ENDFILE{print FILENAME, FNR}' * */*
a
a 10
a.wk
a.wk 3
b
b 10
dir
dir is a dir, skipping
dir/c
dir/c 10

Note also that if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, "is a dir, skipping"; nextfile} works counterintuitively: one might expect the condition to be 1 when the test is true, but system() returns the command's exit status, so it is 0 when the test succeeds and nonzero when FILENAME is a directory.
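To make the exit-status behaviour concrete, here is a minimal sketch. The helper name `dir_status` is hypothetical and used only for illustration; it assumes a POSIX shell and any awk whose system() returns the command's exit status.

```shell
# Hypothetical helper: print the value awk's system() returns for "[ -d PATH ]".
dir_status() {
    # d is passed in with -v; the test command is assembled inside awk
    awk -v d="$1" 'BEGIN { print system("[ -d \"" d "\" ]") }'
}

dir_status /              # the test succeeds, so this prints 0
dir_status /no/such/dir   # the test fails, so this prints 1
```

Because 0 is false in awk, a condition such as `if (system(...))` is taken when the shell test fails, which is why the question negates the test with `!`.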

I read in A.5 Extensions in gawk Not in POSIX awk:

  • Directories on the command line produce a warning and are skipped (see Command-line directories, https://www.gnu.org/software/gawk/manual/html_node/Command_002dline-directories.html)

And the linked page says:

4.11 Directories on the Command Line

According to the POSIX standard, files named on the awk command line must be text files; it is a fatal error if they are not. Most versions of awk treat a directory on the command line as a fatal error.

By default, gawk produces a warning for a directory on the command line, but otherwise ignores it. This makes it easier to use shell wildcards with your awk program:

$ gawk -f whizprog.awk *        # Directories could kill this program

If either of the --posix or --traditional options is given, then gawk reverts to treating a directory on the command line as a fatal error.

See Extension Sample Readdir, for a way to treat directories as usable data from an awk program.

And indeed that is the case: the same command as before fails with --posix:

$ gawk --posix 'BEGINFILE{print FILENAME; if (system(" [ ! -d " FILENAME " ]")) {print FILENAME, "is a dir, skipping"; nextfile}} ENDFILE{print FILENAME, NR}' * */*
gawk: cmd. line:1: fatal: cannot open file `dir' for reading (Is a directory)

I checked the section 16.7.6 Reading Directories linked above, and it talks about readdir:

The readdir extension adds an input parser for directories. The usage is as follows:

@load "readdir"

But I am not sure how to call it or how to use it from the command line.

Recommended answer

If you wanted to safeguard your script from other people mistakenly passing a directory (or anything else that's not a readable text file) to it, you could do this:

$ ls -F tmp
bar  dir/  foo

$ cat tmp/foo
line 1

$ cat tmp/bar
line 1
line 2

$ cat tmp/dir
cat: tmp/dir: Is a directory

$ cat tst.awk
BEGIN {
    for (i=1;i<ARGC;i++) {
        if ( (getline line < ARGV[i]) <= 0 ) {
            print "Skipping:", ARGV[i], ERRNO
            delete ARGV[i]
        }
        close(ARGV[i])
    }
}
{ print FILENAME, $0 }

$ awk -f tst.awk tmp/*
Skipping: tmp/dir Is a directory
tmp/bar line 1
tmp/bar line 2
tmp/foo line 1

$ awk --posix -f tst.awk tmp/*
Skipping: tmp/dir
tmp/bar line 1
tmp/bar line 2
tmp/foo line 1

Per POSIX, getline returns -1 if it fails while trying to retrieve a record from a file (e.g. the file is unreadable, does not exist, or is a directory). You only need GNU awk if you care which of those failures occurred, which it reports through the value of ERRNO.
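As a variation on the script above (a sketch, not the answer's own code), the same check can be wrapped in a shell function. The name `print_readable` is hypothetical. This version saves the path in a variable and closes it before deleting the ARGV entry, since referring to ARGV[i] after the delete yields an empty string. It also leaves out ERRNO, which is specific to GNU awk. How a directory argument behaves may depend on the awk implementation, so the usage below only exercises a missing file.

```shell
# Hypothetical wrapper: print "FILENAME line" for every readable, non-empty file argument.
print_readable() {
    awk '
    BEGIN {
        for (i = 1; i < ARGC; i++) {
            path = ARGV[i]
            # getline returns 1 on success, 0 at end of file, -1 on error
            status = (getline line < path)
            close(path)
            if (status <= 0) {
                print "Skipping:", path
                delete ARGV[i]    # awk ignores deleted ARGV entries
            }
        }
    }
    { print FILENAME, $0 }
    ' "$@"
}

# Usage, with the files from the listing above plus a file that does not exist:
#   print_readable tmp/foo tmp/missing
```

As in the original, an empty file is skipped too, because getline returns 0 for it.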
