Efficient way to ASCII encode UTF-8


Question

I'm looking for a simple and efficient way to store UTF-8 strings in 7-bit ASCII. By "efficient" I mean the following:


  • all ASCII alphanumeric chars in the input should stay the same ASCII alphanumeric chars in the output
  • the resulting string should be as short as possible
  • the operation needs to be reversible without any data loss
  • the resulting ASCII string should be case insensitive
  • there should be no restriction on the input length
  • the whole UTF-8 range should be allowed

My first idea was to use Punycode (IDNA) as it fits the first four requirements, but it fails at the last two.

Can anyone recommend an alternative encoding scheme? Even better if there's some code available to look at.

Recommended answer

UTF-7, or, slightly less transparent but more widespread, quoted-printable.


all ASCII chars in the input should stay ASCII chars in the output

(Obviously not fully possible as you need at least one character to act as an escape.)
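To get a feel for the two encodings named in this answer, here is a minimal Python sketch using only the standard library: the built-in "utf-7" codec and the quopri module. The helper names (to_utf7, from_utf7, to_qp, from_qp) are made up for illustration and are not part of the original answer. Both round trips are lossless and leave ASCII letters and digits untouched, but UTF-7's base64 runs are case sensitive, so this sketch does not address the case-insensitivity requirement from the question.

```python
import quopri


def to_utf7(text: str) -> str:
    """Encode text as UTF-7; non-ASCII runs become '+<base64>-'."""
    return text.encode("utf-7").decode("ascii")


def from_utf7(encoded: str) -> str:
    """Reverse to_utf7."""
    return encoded.encode("ascii").decode("utf-7")


def to_qp(text: str) -> str:
    """Encode the UTF-8 bytes of text as quoted-printable ('=XX' escapes)."""
    return quopri.encodestring(text.encode("utf-8")).decode("ascii")


def from_qp(encoded: str) -> str:
    """Reverse to_qp."""
    return quopri.decodestring(encoded.encode("ascii")).decode("utf-8")


if __name__ == "__main__":
    sample = "Grüße 123"
    print(to_utf7(sample))  # ASCII-only UTF-7 form
    print(to_qp(sample))    # ASCII-only quoted-printable form
```

Quoted-printable spends three output characters per non-ASCII byte, so it tends to be longer than UTF-7 for text that is mostly non-ASCII, while UTF-7 pays a fixed cost for each switch in and out of a base64 run.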
