How to replace Unicode characters with ASCII
Question
I have the following command to replace Unicode characters with ASCII ones.
sed -i 's/Ã/A/g'
The problem is that Ã isn't recognized by the sed command in my Unix environment, so I assume it has to be replaced with its hexadecimal value. What would the syntax look like if I were to use C3 instead?
I'm using this command as a template for other characters I'd like to replace with blank spaces, such as:
sed -i 's/©/ /g'
Recommended answer
It is possible to use hexadecimal byte values in sed (the \xHH escape is a GNU sed extension).
echo "Ã" | hexdump -C
00000000 c3 83 0a |...|
00000003
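hexdump is not installed everywhere, so here is a minimal sketch of the same byte lookup using od, which POSIX specifies. It also looks up ©, the other character from the question. The variable names are illustrative, and the characters are written as octal escapes so the snippet does not depend on how the terminal encodes typed input.

```shell
# Look up the UTF-8 bytes of a character with od instead of hexdump.
# \303\203 and \302\251 are the octal forms of the bytes c3 83 and c2 a9.
a_tilde=$(printf '\303\203' | od -An -tx1)     # bytes of Ã
copyright=$(printf '\302\251' | od -An -tx1)   # bytes of ©

echo "$a_tilde"
echo "$copyright"
```

The two lines should show the byte pairs c3 83 and c2 a9; the exact spacing depends on the od implementation.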
OK, so Ã is encoded as the two-byte sequence "c3 83" (the trailing "0a" is the newline added by echo). Let's replace it with the single byte "A":
echo "Ã" |sed 's/\xc3\x83/A/g'
A
Explanation: \x tells sed that a two-digit hexadecimal byte value follows.
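To tie this back to the in-place commands in the question, here is a sketch that applies both substitutions to a file. It assumes GNU sed (for -i without a suffix and for \x escapes) and a UTF-8 encoded file. The temporary file and its contents are hypothetical, and LC_ALL=C is an added assumption that makes sed match raw bytes regardless of the current locale.

```shell
# Hypothetical sample file containing "xÃy©z" as UTF-8 bytes.
tmpfile=$(mktemp)
printf 'x\303\203y\302\251z\n' > "$tmpfile"

# Replace Ã (c3 83) with A and © (c2 a9) with a space, editing the file in place.
# LC_ALL=C forces byte-wise matching (an assumption, not part of the original answer).
LC_ALL=C sed -i -e 's/\xc3\x83/A/g' -e 's/\xc2\xa9/ /g' "$tmpfile"

result=$(cat "$tmpfile")
printf '%s\n' "$result"
rm -f "$tmpfile"
```

For other characters, look up their bytes first and substitute them into the pattern in the same way.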