How Do You Write Code That Is Safe for UTF-8?

Question

We have a set of applications that were developed for the ASCII character set. Now, we're trying to install it in Iceland, and are running into problems where the Icelandic characters are getting screwed up.

We are working through our issues, but I was wondering: Is there a good "guide" out there for writing C++ code that is designed for 8-bit characters and which will work properly when UTF-8 data is given to it?

I can't expect everyone to read the whole Unicode standard, but if there is something more digestible available, I'd like to share it with the team so we don't run into these issues again.

Re-writing all the applications to use wchar_t or some other string representation is not feasible at this time. I'll also note that these applications communicate over networks with servers and devices that use 8-bit characters, so even if we did Unicode internally, we'd still have issues with translation at the boundaries. For the most part, these applications just pass data around; they don't "process" the text in any way other than copying it from place to place.

The operating systems used are Windows and Linux. We use std::string and plain-old C strings. (And don't ask me to defend any of the design decisions. I'm just trying to help fix the mess.)


Here is a list of what has been suggested:

  • The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
  • UTF-8 and Unicode FAQ for Unix/Linux
  • The Unicode HOWTO

Recommended answer

This looks like a comprehensive quick guide:
http://www.cl.cam.ac.uk/~mgk25/unicode.html
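
For code that only copies std::string data from place to place, UTF-8 usually survives untouched, because every byte of a multi-byte sequence has its high bit set and can never be mistaken for an ASCII character. The places where 8-bit code tends to go wrong are truncating a buffer in the middle of a sequence and treating the byte count as a character count. The following is a minimal sketch of both cases, assuming the input is already valid UTF-8; the function names (is_utf8_continuation, utf8_length, utf8_truncate) are hypothetical helpers made up for this illustration and are not taken from the guide linked above.

```cpp
#include <cstddef>
#include <string>

// True if the byte is a UTF-8 continuation byte (bit pattern 10xxxxxx).
// Hypothetical helper for this sketch.
inline bool is_utf8_continuation(unsigned char byte) {
    return (byte & 0xC0) == 0x80;
}

// Counts code points by counting every byte that is NOT a continuation byte.
// Assumes s holds valid UTF-8; no validation is performed.
std::size_t utf8_length(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char byte : s) {
        if (!is_utf8_continuation(byte)) {
            ++count;
        }
    }
    return count;
}

// Returns a prefix of s that is at most max_bytes long and does not end
// in the middle of a multi-byte sequence. Assumes s holds valid UTF-8.
std::string utf8_truncate(const std::string& s, std::size_t max_bytes) {
    if (s.size() <= max_bytes) {
        return s;
    }
    std::size_t end = max_bytes;
    // s[end] is the first byte that would be dropped. If it is a
    // continuation byte, the cut falls inside a sequence, so back up
    // until the cut sits just before a lead byte or an ASCII byte.
    while (end > 0 && is_utf8_continuation(static_cast<unsigned char>(s[end]))) {
        --end;
    }
    return s.substr(0, end);
}
```

With the Icelandic letter þ (U+00FE, encoded as the two bytes 0xC3 0xBE), std::string::size() reports 2 while utf8_length reports 1, and utf8_truncate with a limit of 1 byte returns an empty string rather than a dangling lead byte. Note that this sketch counts code points, not user-perceived characters, and it does nothing about data arriving in another 8-bit encoding such as ISO-8859-1; that still has to be converted at the boundary.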
