wchar_t vs char for creating an API

Question

I am creating a C++ library meant to be used with different applications written in different languages like Java, C#, Delphi etc.

Every now and then I get stuck on conversions between wstring, string, char* and wchar_t*. For example, I stuck to wchar_t but then had to use a regex library that accepts char, and I ran into other similar problems.

I wish to stick to either the wide types or the normal string types. My library will mostly deal with ASCII characters but can have non-ASCII characters too, as in names etc. So, can I permanently switch to char instead of wchar_t and string instead of wstring? Can I have Unicode support with them, and will it affect scalability and portability across different platforms and languages?

Please advise.

Recommended answer

You need to decide which encoding to use. Some considerations:


  • If you can have non-ASCII characters, then there is no point in choosing ASCII or 8-bit ANSI. That way leads to disappointment and risks data loss.

It makes sense to pick one encoding and stick to it. Everywhere. The Windows API is unusual in supporting both ANSI and Unicode, but that is due to backwards compatibility of old software. If Microsoft were starting over from scratch, there would be one encoding only.

The most common choices for Unicode encoding are UTF-8 and UTF-16. Any decent environment will have support for both. Either choice may be justifiable.
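
As a rough illustration of picking one encoding and using it across the whole API, here is a minimal sketch. It assumes UTF-8 was the encoding chosen and that every string crossing the library boundary is a NUL-terminated UTF-8 char buffer. The names mylib_trim_name and trim_ascii_spaces are hypothetical, invented for this example; a UTF-16 boundary built on char16_t would follow the same shape.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Internal code works on std::string and treats its contents as UTF-8.
static std::string trim_ascii_spaces(const std::string& utf8) {
    const std::size_t first = utf8.find_first_not_of(' ');
    if (first == std::string::npos) return std::string();
    const std::size_t last = utf8.find_last_not_of(' ');
    return utf8.substr(first, last - first + 1);
}

// Hypothetical exported function (assumption: UTF-8 in, UTF-8 out).
// Copies the trimmed name into `out`, NUL-terminated, if it fits.
// Always returns the number of bytes required, including the terminator,
// so a caller can query the size first and allocate accordingly.
extern "C" std::size_t mylib_trim_name(const char* utf8_in,
                                       char* out,
                                       std::size_t out_capacity) {
    if (utf8_in == nullptr) return 0;
    const std::string result = trim_ascii_spaces(utf8_in);
    const std::size_t needed = result.size() + 1;
    if (out != nullptr && out_capacity >= needed) {
        std::memcpy(out, result.c_str(), needed);
    }
    return needed;
}
```

A caller can pass a null buffer first to learn the required size. Trimming on the space byte is safe here because, in UTF-8, bytes below 0x80 never occur inside a multi-byte sequence.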

Java, VB, C# and Delphi all have good support for UTF-16, and all of them use UTF-16 for their native string types (in the case of Delphi, the native string type is UTF-16 only in Delphi 2009 and later; for earlier versions, you can use the WideString string type).
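
On the C++ side, UTF-16 text of the kind those languages use natively can be held in std::u16string (C++11). One thing worth keeping in mind is that UTF-16 is itself variable length: characters outside the Basic Multilingual Plane take two 16-bit units (a surrogate pair). A minimal sketch, where count_code_points_utf16 is a hypothetical helper name chosen for this example:

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-16 string.
// A high surrogate followed by a low surrogate counts as one code point;
// an unpaired surrogate is counted as one unit rather than rejected.
std::size_t count_code_points_utf16(const std::u16string& s) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t unit = s[i];
        const bool is_high = unit >= 0xD800 && unit <= 0xDBFF;
        const bool next_is_low = i + 1 < s.size() &&
                                 s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF;
        if (is_high && next_is_low) {
            ++i;  // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

So the length reported by a UTF-16 string type is a count of code units, not necessarily of characters.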

Windows is natively UTF-16, as are the string types of the languages above (*nix systems, like Linux, conventionally use UTF-8 instead), so it may well be simplest to just use UTF-16.

On the other hand, UTF-8 is probably a technically better choice, being byte-oriented and backwards compatible with 7-bit ASCII. Quite likely, if Unicode were being invented from scratch, there would be no UTF-16, and UTF-8 would be the variable-length encoding.
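
To make the byte-oriented and ASCII-compatible properties concrete, here is a small sketch. The helper names is_pure_ascii and count_code_points_utf8 are hypothetical, and the counting function assumes its input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Every ASCII string is already valid UTF-8: all of its bytes are below 0x80.
bool is_pure_ascii(const std::string& s) {
    for (unsigned char c : s) {
        if (c >= 0x80) return false;
    }
    return true;
}

// Counts code points in a UTF-8 string by skipping continuation bytes,
// which always have the bit pattern 10xxxxxx.
// Assumption: the input is well-formed UTF-8; no validation is done.
std::size_t count_code_points_utf8(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For a mostly ASCII library this means existing char-based code, such as a regex library that accepts char, keeps working on the ASCII parts of the text, while a name like "Zoë" simply occupies four bytes instead of three.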

You have phrased the question as a choice between char and wchar_t. I think that the real choice is what your preferred encoding should be. You also have to watch out that wchar_t is 16 bits (UTF-16) on some systems but 32 bits (UTF-32) on others. It is not a portable data type. That is why C++11 introduced the new char16_t and char32_t data types to correct that ambiguity.
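
The difference can be checked directly. The sketch below is illustrative only; the helper names (wchar_bits, wchar_holds_any_code_point, units_utf16, units_utf32, units_wide) are invented for this example. It compares how many code units the single character U+1F600 occupies in each string type.

```cpp
#include <climits>
#include <cstddef>
#include <cwchar>
#include <string>

// Width of wchar_t on the platform this is compiled for
// (commonly 16 bits on Windows and 32 bits on Linux).
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }

// True if a single wchar_t can hold every code point up to U+10FFFF.
constexpr bool wchar_holds_any_code_point() {
    return static_cast<unsigned long>(WCHAR_MAX) >= 0x10FFFFul;
}

// The C++11 types have guaranteed minimum widths on every platform.
static_assert(sizeof(char16_t) * CHAR_BIT >= 16, "char16_t holds a UTF-16 code unit");
static_assert(sizeof(char32_t) * CHAR_BIT >= 32, "char32_t holds any code point");

// Number of code units that U+1F600 occupies in each string type.
inline std::size_t units_utf16() { return std::u16string(u"\U0001F600").size(); }  // surrogate pair
inline std::size_t units_utf32() { return std::u32string(U"\U0001F600").size(); }  // one unit
inline std::size_t units_wide()  { return std::wstring(L"\U0001F600").size(); }    // platform dependent
```

The char16_t and char32_t results are the same everywhere, whereas the wstring result depends on the compiler and platform, which is the portability problem described above.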
