How to convert an ASCII NULL (NUL) into a single space in a text file using a Unix command?


Question

When I export data from SQL Server with BCP, the output file contains a NUL-like character, and I want to replace it with a single blank space.

When I use the sed command below, the NUL character is removed, but no space is left between the two delimiters.

sed 's/\x0/ /g' output_file_name

Example: after running the sed command, I get an output file like this:

PHMO||P00000005233
PHMO||P00000005752

But I need a single space between those delimiters, like this:

PHMO| |P00000005233
PHMO| |P00000005752

Recommended answer

The usual approach to this would be tr. However, tr and sed do not handle NUL consistently across Unix implementations, so solutions built on them are not portable. (The question is tagged "unix", so only portable solutions are of interest.)

Here is a simple demo script:

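#!/bin/sh
# Run tr, then sed, on the same input file (named after this script, plus ".in").
# The date lines separate the two outputs.
date
tr '\000' ' ' <"$0.in"
date
sed -e 's/\x00/ /g' <"$0.in"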

I named the script foo. Its input (the ASCII NUL is shown here as ^@) is:

this is a null: "^@"
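
To reproduce this test, one needs an input file that really contains a NUL byte and a way to confirm it is there. The sketch below is one way to do that, assuming a printf that accepts octal escapes and an od that supports the -An, -v and -tx1 options; the temporary file and the variable names infile and nul_count are illustrative, not part of the original script. It counts NULs from a hex dump, so it does not depend on tr or sed handling NUL correctly.

```shell
#!/bin/sh
# Hypothetical stand-in for the demo script's input file.
infile=$(mktemp)
printf 'this is a null: "\000"\n' > "$infile"

# Dump every byte as two hex digits, put one value per line,
# and count the lines that are exactly "00" (the NUL bytes).
nul_count=$(od -An -v -tx1 "$infile" | tr -s ' ' '\n' | grep -c '^00$')
echo "NUL bytes: $nul_count"

rm -f "$infile"
```

Running the same count on a converted output file is a quick way to check that no NUL bytes remain.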

Running with GNU tr and sed:

Fri Apr  1 04:41:15 EDT 2016
this is a null: " "
Fri Apr  1 04:41:15 EDT 2016
this is a null: " "

With OS X:

Fri Apr  1 04:41:53 EDT 2016
this is a null: " "
Fri Apr  1 04:41:53 EDT 2016
this is a null: "^@"

With Solaris 10 (and 11, though there may be a recent change):

Fri Apr  1 04:38:08 EDT 2016
this is a null: ""
Fri Apr  1 04:38:08 EDT 2016
this is a null: ""

Bear in mind that sed is line-oriented, and that ASCII NUL is considered a binary (non-text) character. If you want a portable solution, other tools such as Perl, which do not have that limitation, are useful. For that case one could add this to the script:

perl -p -e 's/\0/ /g' <"$0.in"

The intermediate tool awk is no better in this instance. Going to Solaris again, with these lines:

for awk in awk nawk mawk gawk
do
echo "** $awk:"
$awk '{ gsub("\0"," "); print; }' <$0.in
done

I see this output:

** awk:
awk: syntax error near line 1
awk: illegal statement near line 1
** nawk:
nawk: empty regular expression
 source line number 1
 context is
        { gsub("\0"," >>>  ") <<<
** mawk:
this is a null: " "
** gawk:
this is a null: " "

Further reading:

  • sed - stream editor (POSIX)
  • tr - translate characters (POSIX), which notes

Unlike some historical implementations, this definition of the tr utility correctly processes NUL characters in its input stream. NUL characters can be stripped by using:

tr -d '\000'

  • perlrun - how to execute the Perl interpreter
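
The tr note quoted in the list above strips NULs; the question wants them replaced with a space instead. On a system whose tr handles NUL as POSIX describes (GNU tr did in the runs shown earlier), a file could be converted along the lines of the sketch below. The helper name replace_nul and the temporary files are hypothetical, and the sketch does not make tr any more portable than the results above suggest.

```shell
#!/bin/sh
# replace_nul is a hypothetical helper: rewrite FILE with each NUL turned into a space.
replace_nul() {
    tmp=$(mktemp) || return 1
    # tr only reads standard input, so redirect the file in and
    # collect the result in a temporary file first.
    if tr '\000' ' ' < "$1" > "$tmp"; then
        # Copy back with cat so the original file keeps its permissions.
        cat "$tmp" > "$1"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

demo=$(mktemp)
printf 'PHMO|\000|P00000005233\n' > "$demo"
replace_nul "$demo"
result=$(cat "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Writing to a temporary file first avoids truncating the input while tr is still reading it.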