Trailing new line after piping to a command: is there any standard?


Question

While answering "How to remove the last CR char with cut", I found that some programs add a trailing newline to the end of a string, while others don't:

Say we have the string foobar and print it with printf so that we don't get an extra new line:

$ printf "foobar" | od -c
0000000   f   o   o   b   a   r
0000006

Or with echo -n:

$ echo -n "foobar" | od -c
0000000   f   o   o   b   a   r
0000006

(echo's default behaviour is to print its output followed by a newline, so echo "foobar" produces f o o b a r \n).

Neither sed nor cat adds any extra character:

$ printf "foobar" | sed 's/./&/g' | od -c
0000000   f   o   o   b   a   r
0000006
$ printf "foobar" | cat - | od -c
0000000   f   o   o   b   a   r
0000006

Both awk and cut, however, do add one. xargs and paste also add this trailing newline:

$ printf "foobar" | cut -b1- | od -c
0000000   f   o   o   b   a   r  \n
0000007
$ printf "foobar" | awk '1' | od -c
0000000   f   o   o   b   a   r  \n
0000007
$ printf "foobar" | xargs | od -c
0000000   f   o   o   b   a   r  \n
0000007
$ printf "foobar" | paste | od -c
0000000   f   o   o   b   a   r  \n
0000007

So I was wondering: why do these tools behave differently? Does POSIX say anything about this?

Note that I am running all of this in Bash 4.3.11, with the following tool versions:

  • GNU Awk 4.0.1
  • sed (GNU sed) 4.2.2
  • cat (GNU coreutils) 8.21
  • cut (GNU coreutils) 8.21
  • xargs (GNU findutils) 4.4.2
  • paste (GNU coreutils) 8.21

Answer

So I was wondering: why do these tools behave differently? Does POSIX say anything about this?

Some commands (printf, for example) are thin interfaces to libc library calls (e.g. printf()), which do not add a \n automatically. Most *NIX text-processing commands add a \n at the end of the last line.

According to the Definitions chapter of POSIXv7, a line of text must end with a newline:

3.206 Line

A sequence of zero or more non-<newline> characters plus a terminating <newline> character.

If the newline is missing, it becomes this:

3.195 Incomplete Line

A sequence of one or more non-<newline> characters at the end of the file.
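
The two definitions above can be told apart by looking at the last byte of a file. The following is a minimal sketch, assuming a POSIX shell and a tail that supports -c; the function name ends_with_newline and the temporary file names are made up for illustration.

```shell
# ends_with_newline FILE (hypothetical helper):
# succeed if FILE is empty or its last byte is a newline.
ends_with_newline() {
    # Command substitution strips trailing newlines, so the result is empty
    # when the last byte is "\n" (or when the file has no bytes at all).
    [ -z "$(tail -c 1 -- "$1")" ]
}

tmpdir=$(mktemp -d)
printf 'foobar'   > "$tmpdir/incomplete.txt"   # ends in an incomplete line
printf 'foobar\n' > "$tmpdir/complete.txt"     # ends in a proper line

for f in "$tmpdir/incomplete.txt" "$tmpdir/complete.txt"; do
    if ends_with_newline "$f"; then
        echo "${f##*/}: last line is complete"
    else
        echo "${f##*/}: incomplete last line"
    fi
done
```

The check relies on the shell discarding trailing newlines in command substitution, so it is only meant for text files; a file whose last byte is NUL would probably be misreported.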

The general idea is that a text file can be treated as a list of records, where every record is terminated by \n. In other words, \n is not a separator between lines; it is part of the line. See, for example, the fgets() function: it keeps the \n in the buffer when one was read, so its presence tells the caller whether the whole line was read. If the last line is missing the \n, the reader has to do extra checks to read the file correctly.
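
The same "extra check" shows up when reading lines in the shell. As a rough sketch (the function name number_lines is hypothetical, and this relies on the common behaviour of read leaving the partial data in the variable when it hits end of input, as bash and dash do), a loop that does not lose an incomplete last line might look like this:

```shell
# number_lines (hypothetical helper): print each line of stdin with a number,
# including a final line that lacks its terminating newline.
number_lines() {
    n=0
    # read returns non-zero at end of input even when it has stored a partial
    # line in $line, so the extra test keeps that incomplete last line.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        printf '%d: %s\n' "$n" "$line"
    done
}

# The input ends with an incomplete line ("two" has no trailing newline).
printf 'one\ntwo' | number_lines
```

Without the `|| [ -n "$line" ]` part, the loop body would not run for the incomplete last line, and "two" would be silently dropped.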

In general, as long as your text files are created on *NIX by *NIX programs or scripts, it is reasonable to expect that the last line is properly terminated. But many Java applications, as well as Windows applications, do not handle this correctly or consistently. Not only do they often omit the final \n, they often also incorrectly treat the trailing \n as an additional empty line.
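
For files that come from such tools, one possible approach is to normalise them before further processing. The sketch below is only an illustration, assuming a POSIX shell and tail -c; ensure_final_newline is a made-up name, and it modifies the file in place.

```shell
# ensure_final_newline FILE (hypothetical helper):
# append a newline only if the last line of FILE is incomplete.
ensure_final_newline() {
    # A non-empty result means the last byte exists and is not a newline.
    if [ -n "$(tail -c 1 -- "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

f=$(mktemp)
printf 'foobar' > "$f"      # 6 bytes, no trailing newline
ensure_final_newline "$f"   # appends "\n"
ensure_final_newline "$f"   # already terminated, so nothing changes
wc -c < "$f"                # byte count after normalising
```

Running the function twice is deliberate: it only appends when the terminator is missing, so an already well-formed file is left untouched.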
