How to find and delete a file containing some string in its body, using awk, from among multiple files?

Question

I have multiple files in the directory Record:

Record
   1.txt
   2.txt
   3.txt

The file 2.txt contains a string abcd in the second column of the first line. How can I print all the contents in 2.txt? How can I delete the file 2.txt?

I used awk to print all the contents in that file but it only prints that line.

I used the find command to store the file name in file.txt, but it gives me an error.

rm -rf Record
mkdir Record
cd Record
echo f1
touch 1.txt
echo author: efg   > 1.txt
echo title: hijk  >> 1.txt
echo pages: 1990  >> 1.txt
echo year: 1890  >> 1.txt
touch 2.txt
echo author: abcd > 2.txt
echo author: lmno >> 2.txt
echo title: pqrs >> 2.txt
echo pages: 354 >> 2.txt
echo year:  1970 >> 2.txt
touch 3.txt
echo author: aklj > 3.txt
echo title: dban  >> 3.txt
echo pages: 876  >> 3.txt
echo year: 1860  >> 3.txt
cd ..
adress=./Record/*.txt
sfind=abcd
  awk ' BEGIN { sfind = ENVIRON["sfind"] }
    FNR == 1 { secondPass = seen[FILENAME]++ }
    secondPass { print FILENAME, $0; next }
    index($2,sfind) {
        ARGV[ARGC] = FILENAME
        ARGC++
        nextfile       
    }
'
$adress

Recommended answer

sfind='abcd' awk '
    BEGIN { sfind = ENVIRON["sfind"] }
    FNR == 1 { secondPass = seen[FILENAME]++ }
    secondPass { print FILENAME, $0; next }
    index($2,sfind) {
        ARGV[ARGC++] = FILENAME
        nextfile        # for efficiency if using GNU gawk.
    }
' ./Record/*.txt

The above makes two passes over the input files. The first pass identifies the files that contain the string stored in sfind in $2 and appends them to the end of ARGV[] so they will be processed again later; the second pass prints the contents of the files identified in the first pass. If you don't want the input file name printed at the start of each output line, just change print FILENAME, $0 to print.
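
The question also asks how to delete the matching file, which the script above does not do. A minimal sketch of one way to approach it, assuming file names contain no newline characters: have awk print each matching file name once, then remove those files in a shell loop. The temporary directory and the shortened sample data are stand-ins for the question's Record directory, not part of the original answer.

```shell
# Throwaway sample data (hypothetical stand-in for ./Record).
dir=$(mktemp -d)
printf 'author: efg\ntitle: hijk\n'  > "$dir/1.txt"
printf 'author: abcd\ntitle: pqrs\n' > "$dir/2.txt"
printf 'author: aklj\ntitle: dban\n' > "$dir/3.txt"

# Print the name of each file whose $2 contains sfind (at most once per file),
# then delete every file named on the pipe.
sfind='abcd' awk '
    BEGIN { sfind = ENVIRON["sfind"] }
    FNR == 1 { found = 0 }                        # reset the flag for each new file
    !found && index($2, sfind) { print FILENAME; found = 1 }
' "$dir"/*.txt |
while IFS= read -r file; do
    rm -- "$file"
done

ls "$dir"    # lists the files that remain
```

To match only on the first line of each file, as the question describes, the condition could be narrowed to FNR == 1 && index($2, sfind). It may be worth replacing rm with echo for a dry run before deleting anything.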

The above will work for any number of matches in any number of files (0, 1, 2, whatever), for any file names, even if they contain spaces, globbing characters, etc., and for any characters in sfind, including backslash escapes and regexp metacharacters like . or *.
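
The reason backslashes in sfind survive is that the value is read through ENVIRON rather than assigned with -v, which processes escape sequences. A small illustration of the difference, using a made-up search string (the variable names len_v and len_env are only for this sketch):

```shell
sfind='a\tb'    # four characters: a, backslash, t, b

# -v interprets escape sequences, so \t becomes a single tab character.
len_v=$(awk -v s="$sfind" 'BEGIN { print length(s) }')

# ENVIRON hands the value to awk unchanged.
len_env=$(sfind="$sfind" awk 'BEGIN { print length(ENVIRON["sfind"]) }')

printf 'with -v: %s characters, with ENVIRON: %s characters\n' "$len_v" "$len_env"
```

With -v the string arrives as three characters, while ENVIRON preserves all four, so a search for a literal backslash sequence only behaves as expected with the ENVIRON form.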

The above does partial string matching. Here are your options:

  • Partial string: index($2,sfind) (as shown)
  • Full field string: $2 == sfind
  • Partial regexp: $2 ~ sfind
  • Full field regexp: $2 ~ ("^" sfind "$")
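
To see how the four options differ, here is a small sketch that evaluates each test against a few made-up field values, with sfind set to a.c so that the regexp metacharacter matters. The sample input and the column order (partial string, full string, partial regexp, full regexp) are choices made for this illustration.

```shell
out=$(printf '%s\n' 'k: a.c' 'k: xa.cx' 'k: abc' 'k: xabcx' |
sfind='a.c' awk '
    BEGIN { sfind = ENVIRON["sfind"] }
    {
        partial_str = (index($2, sfind) > 0)   # literal substring
        full_str    = ($2 == sfind)            # literal, whole field
        partial_re  = ($2 ~ sfind)             # regexp, anywhere in field
        full_re     = ($2 ~ ("^" sfind "$"))   # regexp, whole field
        print $2, partial_str, full_str, partial_re, full_re
    }')

printf '%s\n' "$out"
# a.c 1 1 1 1
# xa.cx 1 0 1 0
# abc 0 0 1 1
# xabcx 0 0 1 0
```

The string tests treat the dot literally, so abc does not match them, whereas both regexp tests treat it as "any character" and accept abc.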

Full word matching gets trickier, depends on your definition of a "word", and can be served by implementation-specific constructs so I'll leave that out unless you need it.
