Awk to read file as a whole

Question

Let the file content be:

abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq

In general, when awk performs an operation, it iterates over the input line by line and applies the action to each line.

For example:

awk '{print substr($0,8,10)}' file

Output:

hijklmn
wxyzabc
klmnopq

I would like to know an approach in which all the contents of the file are treated as a single variable and awk prints just one output.

Example of the desired output:

hijklmnpqr

It's not that I need the desired output for this particular question; in general, I would appreciate it if anyone could suggest a way to provide the content of a file as a whole to awk.

Recommended answer

This is a gawk solution.

From the docs:

There are times when you might want to treat an entire data file as a single record. The only way to make this happen is to give RS a value that you know doesn’t occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files.


$ cat file
abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq

RS must be set to a pattern that does not occur in the file, following Denis Shirokov's suggestion in the docs (thanks, @EdMorton):

$ gawk '{print ">>>"$0"<<<<"}' RS='^$' file
>>>abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq
<<<<

The trick:

It works by setting RS to ^$, a regular expression that will never match if the file has contents. gawk reads data from the file into tmp, attempting to match RS. The match fails after each read, but fails quickly, such that gawk fills tmp with the entire contents of the file.


So:

$ gawk '{gsub(/\n/,"");print substr($0,8,10)}' RS='^$' file

Returns:

hijklmnpqr
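
Where gawk is not available, a different technique gives the same result for this case: let awk read line by line as usual, append each line to a variable, and do the work once in END. This is a sketch rather than part of the original answer; it relies only on standard awk features, and the names buf and joined are illustrative.

```shell
# Recreate the sample file from the question in a temporary location.
file=$(mktemp)
printf 'abcdefghijklmn\npqrstuvwxyzabc\ndefghijklmnopq\n' > "$file"

# Append every line to buf, then slice the joined text once at the end.
# Concatenating without a separator mirrors the gsub(/\n/,"") step above.
joined=$(awk '{ buf = buf $0 } END { print substr(buf, 8, 10) }' "$file")

echo "$joined"
rm -f "$file"
```

To keep the line breaks in the accumulated text, append $0 "\n" instead of $0 alone.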
