How to trim consecutive whitespace from beginning/end of file via sed

Question

Using sed, how can I trim one or more consecutive whitespace-only lines from the beginning and/or end of a file? (By "whitespace-only", I mean lines which do not contain any non-whitespace characters, i.e. lines which are either blank or only include whitespace characters.)

For example, if my file is:

<blank line>
<line only containing some space/tab characters>
<blank line>
foo
bar
<tab character>
baz
<space character>
<space character><tab character>
qux
<tab character>

Then the desired output would be:

foo
bar
<tab character>
baz
<space character>
<space character><tab character>
qux

If trimming from the beginning and end of the file has to be done in separate sed invocations, that's OK, although I'd also be interested in solutions which manage it all within one invocation.

P.S. This is easy in Perl / Ruby etc., but I'd specifically like to know if it's possible in sed. Thanks!

Recommended answer

I don't see any real sed experts popping up with a solution yet, so here's my attempt (GNU sed specific due to \S and \s; replace them with [^[:space:]] and [[:space:]] respectively for POSIX):

$ sed -e '/\S/,$!d' -e :a -e '/^\s*$/{$d;N;ba' -e '}' file
foo
bar

baz


qux
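
The POSIX substitution mentioned above can be sketched as follows. This is not part of the original answer: the function names trim_leading, trim_trailing and trim_both and the sample_input helper are hypothetical, and the sketch assumes that [[:space:]] also matches the newlines that N embeds in the pattern space (true for GNU sed; I'd expect other seds to agree, but that is an assumption). Splitting the script in two also shows how the trimming could be done in separate invocations, which the question allows.

```shell
#!/bin/sh
# Hypothetical helpers; the sed scripts are the command from the answer with
# \S and \s replaced by POSIX character classes. Each function reads stdin.

# Delete every line that comes before the first line containing a
# non-whitespace character.
trim_leading() {
    sed '/[^[:space:]]/,$!d'
}

# While the pattern space is whitespace-only: delete it if the last line
# has been reached, otherwise append the next line and test again.
trim_trailing() {
    sed -e :a -e '/^[[:space:]]*$/{$d;N;ba' -e '}'
}

# Both steps in a single invocation, as in the answer.
trim_both() {
    sed -e '/[^[:space:]]/,$!d' -e :a -e '/^[[:space:]]*$/{$d;N;ba' -e '}'
}

# Sample data modelled on the file in the question.
sample_input() {
    printf '\n \t\n\nfoo\nbar\n\t\nbaz\n \n \t\nqux\n\t\n'
}

# "sed -n l" makes tabs and line ends visible in the result.
sample_input | trim_both | sed -n l
sample_input | trim_leading | trim_trailing | sed -n l
```

Whitespace-only lines between foo and qux survive because the loop stops appending as soon as a line with content joins the pattern space, at which point the whole accumulated block is printed.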

And in case anyone wants to see a sensible approach to compare to whatever arcane sed incantation IS eventually invoked, here's one way using GNU awk for multi-char RS and the \s abbreviation for [[:space:]]:

$ awk -v RS='^$' '{gsub(/^\s+|\s+$/,"")}1' file
foo
bar

baz


qux

POSIX equivalent if you're happy picking some control char you know can't be in your input (e.g. using ^C = a literal control-C char):

awk -v RS='^C' '{gsub(/^[[:space:]]+|[[:space:]]+$/,"")}1' file

Otherwise:

awk '{rec=rec $0 RS} END{gsub(/^[[:space:]]+|[[:space:]]+$/,"",rec); print rec}' file

or if you are limited in memory and can't read the whole file at once, you need 2 passes to identify where the last non-blank line is, e.g.:

awk 'NR==FNR{if(NF){if(!beg)beg=NR; end=NR}; next} (FNR>=beg)&&(FNR<=end)' file file
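
The two-pass one-liner above can be unpacked like this. It is only a sketch for illustration: the wrapper name trim_two_pass and the temporary file are hypothetical, and mktemp is assumed to be available. Because the file is read twice, the input has to be a real file rather than a pipe. Note that NF uses awk's default field splitting, so a line holding only some other whitespace character such as a carriage return would likely count as content here, unlike with [[:space:]].

```shell
#!/bin/sh
# Hypothetical wrapper around the two-pass awk program from the answer.
# Usage: trim_two_pass FILE   (FILE is read twice, so it cannot be stdin)
trim_two_pass() {
    awk '
        # Pass 1: NR equals FNR only while the first copy is being read.
        # Record the first and last line numbers that have at least one field.
        NR == FNR {
            if (NF) {
                if (!beg) beg = NR
                end = NR
            }
            next
        }
        # Pass 2: print only the lines from beg to end inclusive.
        FNR >= beg && FNR <= end
    ' "$1" "$1"
}

# Sample data modelled on the file in the question.
tmp=$(mktemp) || exit 1
printf '\n \t\n\nfoo\nbar\n\t\nbaz\n \n \t\nqux\n\t\n' > "$tmp"
trim_two_pass "$tmp" | sed -n l
rm -f "$tmp"
```

Only two integers are kept between passes, which is what makes this variant suitable when the file is too large to hold in memory.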

or you need to buffer the blank lines (after the initial set of them) until you hit a non-blank line and then print that buffer before the current line:

awk 'NF{printf "%s%s\n",buf,$0; buf=""; f=1; next} f{buf = buf $0 RS}' file
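
For readers who find the buffering one-liner above dense, here is the same logic spread out with comments. It is a sketch, not the answer's exact text: the wrapper name trim_buffered is hypothetical and the flag f has been renamed seen for readability.

```shell
#!/bin/sh
# Hypothetical wrapper around the buffering awk program from the answer.
# Reads stdin, so it works in a pipeline and needs only a single pass.
trim_buffered() {
    awk '
        # A line with at least one field: flush any held-back whitespace-only
        # lines, print the line, and remember that content has started.
        NF {
            printf "%s%s\n", buf, $0
            buf = ""
            seen = 1
            next
        }
        # A whitespace-only line after content has started: hold it back.
        # If no further content arrives, the buffer is never printed.
        seen { buf = buf $0 RS }
    '
}

# Sample data modelled on the file in the question.
printf '\n \t\n\nfoo\nbar\n\t\nbaz\n \n \t\nqux\n\t\n' | trim_buffered | sed -n l
```

Memory use is bounded by the longest single run of whitespace-only lines rather than by the size of the file, since the buffer is emptied each time a line with content is printed.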
