How do you convert a Unicode string to escapes in bash?


Question

I need a tool that translates a Unicode string into escape sequences such as \u0230.

For example:

echo ãçé | convert-unicode-tool
\u00e3\u00e7\u00e9


Recommended answer

An all-bash method:

echo ãçé |
   while read -n 1 u
   do [[ -n "$u" ]] && printf '\\u%04x' "'$u"
   done

The leading apostrophe in "'$u" is an interpretation hint for printf: it makes printf treat the argument as the numeric value of the character that follows the apostrophe.

From the online GNU manual:


If the leading character of a numeric argument is ‘"’ or ‘'’ then its value is the numeric value of the immediately following character. Any remaining characters are silently ignored if the POSIXLY_CORRECT environment variable is set; otherwise, a warning is printed. For example, ‘printf "%d" "'a"’ outputs ‘97’ on hosts that use the ASCII character set, since ‘a’ has the numeric value 97 in ASCII.

That lets us pass the character to printf for numeric conversions such as %d or %03o, or here, %04x.
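As a quick illustration of the leading-apostrophe rule, the sketch below prints the same character with the three conversions mentioned above. The helper name char_code is made up for this example and is not part of the original answer.

```bash
#!/usr/bin/env bash
# char_code is a hypothetical helper name used only for illustration.
# The leading apostrophe makes printf use the numeric value of the
# character that follows it.
char_code() {
    printf '%d\n' "'$1"
}

char_code a            # prints 97 on an ASCII-compatible host
printf '%03o\n' "'a"   # prints 141 (octal)
printf '%04x\n' "'a"   # prints 0061 (hex, zero-padded to 4 digits)
```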

The [[ -n "$u" ]] check is needed because echo appends a trailing newline. read treats that newline as the line delimiter and leaves u empty, and printf would otherwise print the empty value as \u0000.
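To see where the extra value comes from, this small sketch counts how often read succeeds and how often it leaves the variable empty. The function name count_reads is hypothetical and was introduced only for this demonstration.

```bash
#!/usr/bin/env bash
# count_reads is a hypothetical helper: it reads stdin one character
# at a time and reports how many reads succeeded and how many of
# those left the variable empty.
count_reads() {
    local u count=0 empties=0
    while read -n 1 u; do
        count=$((count + 1))
        if [[ -z "$u" ]]; then
            empties=$((empties + 1))
        fi
    done
    echo "$count reads, $empties empty"
}

# echo sends "ab" plus a newline, so the third read returns empty.
echo ab | count_reads   # prints: 3 reads, 1 empty
```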

Output:

$:     echo ãçé |
>        while read -n 1 u
>        do [[ -n "$u" ]] && printf '\\u%04x' "'$u"
>        done
\u00e3\u00e7\u00e9

Without the [[ -n "$u" ]] check:

$: echo ãçé | while read -n 1 u; do printf '\\u%04x' "'$u";done
\u00e3\u00e7\u00e9\u0000
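The loop can also be wrapped in a function so that it behaves like the tool asked for in the question. This is a minimal sketch, not the answer's own code: the name to_escapes is hypothetical, and IFS= and -r are additions that preserve whitespace and backslashes in the input. It assumes a UTF-8 locale for non-ASCII input.

```bash
#!/usr/bin/env bash
# to_escapes is a hypothetical name; it is not part of the original answer.
# Reads stdin one character at a time and prints each one as \uXXXX.
to_escapes() {
    local u
    # IFS= keeps whitespace characters; -r keeps backslashes literal.
    while IFS= read -r -n 1 u; do
        # An empty value means read hit a newline delimiter; skip it.
        if [[ -n "$u" ]]; then
            printf '\\u%04x' "'$u"
        fi
    done
    printf '\n'
}

echo 'abc' | to_escapes
```

Note that %04x only sets a minimum width, so a character above U+FFFF would come out with five or more hex digits rather than as a four-digit \u escape.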
