Replacing empty fields in delimited text file with dummy value


Problem description

I am working on a project that takes a delimited set of data of the form:

field1~field2~field3~.....~fieldn

There can be empty fields, so

field1~~~field4~~field6

is perfectly acceptable.

This file gets translated using an in-house translator program that leaves a little to be desired. Specifically, it doesn't deal with empty fields well. My solution was to stick some dummy value in there, like a space or an @ sign. I've tried:

sed -r 's/~~/~ ~/g'

awk '{gsub(/\~\~/,"~ ~")}; 1' file > file.SPACE

but both of these fall short when replacing MULTIPLE consecutive empty fields. So if I input

field1~field2~~~field3

it will output:

field1~field2~ ~~field3

I'd like to just script this if I could, as I can't change the code of the translator. I can change the code in the program that creates the delimited file, but I'd rather not. Is there some workaround, or is coming up with an expression for this just one of the inherent limitations in a regular language?

Edit: Wow, thanks for the quick response everyone. All your solutions worked, so I upvoted all of them. I think I'm going to accept Janito's because of the explanation.

Why the downvote?

Recommended answer

You can try:

sed -e ':a;s/~~/~ ~/;ta'

This creates a label "a" with the ":" command, replaces one occurrence of ~~ with ~ ~, and then uses the "t" (test) command to jump back to label "a" if the previous substitute command succeeded. The loop is needed because matches found with the g flag cannot overlap: in ~~~, the middle ~ is consumed by the first match and cannot also start a second one.
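As a rough illustration of the difference, the sketch below runs the question's sample line through a single pass with the g flag and through the label loop. The variable names (input, single_pass, looped) are made up for this example, and the loop is written with separate -e arguments on the assumption that this form is accepted by both GNU and BSD sed.

```shell
# Sample input taken from the question.
input='field1~field2~~~field3'

# Single pass with the g flag: matches cannot overlap, so one gap is missed.
single_pass=$(printf '%s\n' "$input" | sed 's/~~/~ ~/g')

# Label loop: substitute one "~~" at a time and branch back to :a
# for as long as the previous substitution succeeded.
looped=$(printf '%s\n' "$input" | sed -e ':a' -e 's/~~/~ ~/' -e 'ta')

printf '%s\n' "$single_pass"   # field1~field2~ ~~field3
printf '%s\n' "$looped"        # field1~field2~ ~ ~field3
```

Note that this pattern only fills fields that sit between two delimiters. An empty first or last field (a line that starts or ends with ~) is left untouched and would need its own substitution.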

Hope this helps =)
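The same loop-until-stable idea can also be applied to the awk attempt from the question. This is only a sketch and not part of the original answer: it relies on gsub returning the number of substitutions it made, and the variable names (input, filled) are made up for this example.

```shell
# Sample input taken from the question.
input='field1~~~field4~~field6'

# Repeat gsub until a pass makes no substitution, then print the line.
filled=$(printf '%s\n' "$input" | awk '{
    while (gsub(/~~/, "~ ~") > 0) {
        # nothing to do here: the loop condition performs the work
    }
    print
}')

printf '%s\n' "$filled"   # field1~ ~ ~field4~ ~field6
```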
