Why is my Bash script adding <feff> to the beginning of files?

Problem description

I've written a script that uses sed to clean up .csv files by removing bad commas and bad quotes ("bad" meaning they break an in-house program we use to transform these files):

# remove all commas, and re-insert the good commas using clean.sed
sed -f clean.sed "$1" > "$1.1st"

# remove all quotes
sed 's/\"//g' "$1.1st" > "$1.tmp"

# add the good quotes around good commas
sed 's/\,/\"\,\"/g' "$1.tmp" > "$1.tmp1"

# add leading quotes
sed 's/^/\"/' "$1.tmp1" > "$1.tmp2"

# add trailing quotes
sed 's/$/\"/' "$1.tmp2" > "$1.tmp3"

# remove utf characters
sed 's/<feff>//' "$1.tmp3" > "$1.tmp4"

# replace original file with new stripped version and delete .tmp files
cp -rf "$1.tmp4" "quotes_$1"

Here is clean.sed:

s/\",\"/XXX/g;
:a
s/,//g
ta
s/XXX/\",\"/g;

Then it removes the temp files and voilà, we have a new file whose name starts with "quotes" that we can use for our other processes.
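
For comparison, the same steps can be sketched as a single pipeline with no intermediate files. This is only an illustration under assumptions: clean_csv is a hypothetical function name, the optional second argument is an invented convenience for pointing at the sed script, and the <feff> removal step is left out on purpose because it is the subject of the question.

```shell
# clean_csv INPUT [SCRIPT]: hypothetical wrapper around the steps above.
# SCRIPT defaults to clean.sed in the current directory.
clean_csv() {
    # First pass: drop the bad commas, keeping the "," separators.
    sed -f "${2:-clean.sed}" "$1" |
        # Second pass: remove all quotes, re-quote around each comma,
        # then add the leading and trailing quote on every line.
        sed -e 's/"//g' \
            -e 's/,/","/g' \
            -e 's/^/"/' \
            -e 's/$/"/'
}
```

It could then be called as clean_csv "$1" > "quotes_$1", which avoids having to clean up the .tmp files afterwards.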

My question is:
Why do I have to add a sed statement to remove the <feff> marker in that temp file? The original file doesn't appear to have it, but it always shows up in the replacement. At first I thought cp was causing this, but if I put the sed statement that removes it before the cp, it isn't there.

Maybe I'm just missing something...

Recommended answer

U+FEFF is the code point for a byte order mark (BOM). Your files most likely contain data saved in UTF-16, and the BOM has been corrupted by your "cleaning process", which is most likely expecting ASCII. Rather than removing the BOM, it is probably better to fix your scripts so that they do not corrupt it in the first place.
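
One way to check this premise is to look at the first bytes of the original file. The sketch below is an illustration, not part of the original answer: detect_bom and quote_lines are hypothetical helper names, only the UTF-8 and UTF-16 marks are recognised, and quote_lines only handles UTF-8 input (UTF-16 data would need converting first, for example with iconv, which is not shown).

```shell
# detect_bom FILE: hypothetical helper that prints which BOM, if any,
# FILE starts with.
detect_bom() {
    # Dump the first three bytes as hex, e.g. "efbbbf" for a UTF-8 BOM.
    bom_hex=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    case "$bom_hex" in
        efbbbf*) echo "utf-8" ;;
        fffe*)   echo "utf-16le" ;;
        feff*)   echo "utf-16be" ;;
        *)       echo "none" ;;
    esac
}

# quote_lines FILE: hypothetical helper that adds the leading and trailing
# quotes while keeping a UTF-8 BOM as the very first bytes of the output.
quote_lines() {
    if [ "$(detect_bom "$1")" = "utf-8" ]; then
        printf '\357\273\277'      # re-emit the BOM first
        tail -c +4 "$1" |          # then process everything after it
            sed -e 's/^/"/' -e 's/$/"/'
    else
        sed -e 's/^/"/' -e 's/$/"/' "$1"
    fi
}
```

If detect_bom reports a mark on the original file, note that the "add leading quotes" step in the question inserts a quote in front of it, so the mark is no longer the first thing in the file. That may be why it only becomes visible in the output.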
