How can I use wstring(s) in Linux APIs?


Question

I want to develop an application on Linux. I want to use wstring because my application should support Unicode, and I don't want to use UTF-8 strings.

On Windows, using wstring is easy, because every ANSI API has a Unicode counterpart. For example, there are two CreateProcess functions: CreateProcessA and CreateProcessW.

wstring app = L"C:\\test.exe";
CreateProcess
(
  app.c_str(), // EASY!
  ....
);

But working with wstring on Linux seems complicated! For example, there is an API in Linux called parport_open (it is just an example).

I don't know how to pass my wstring to this API (or to APIs like parport_open that accept a string parameter).

wstring name = L"myname";
parport_open
(
  0, // or a valid number. It is not important in this question.
  name.c_str(), // Error: the type of this parameter is char*, not wchar_t*
  ....
);

My question is: how can I use wstring(s) in Linux APIs?

Note: I don't want to use UTF-8 strings.

Thanks

Recommended answer

Linux APIs (on recent kernels and with a correct locale setting) use UTF-8 strings by default on almost every distribution [1]. You should use them inside your code too. Resistance is futile.
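If the application still keeps its text in wstring internally, one option is to convert at the API boundary. The following is only a minimal sketch: `to_utf8` is a hypothetical helper name, not part of any Linux API, and it assumes that each wchar_t holds one Unicode code point (a 32-bit wchar_t, as with glibc on Linux; this does not hold on Windows).

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encodes a std::wstring as UTF-8.
// Assumption: wchar_t is 32 bits wide and each element is one code point.
std::string to_utf8(const std::wstring& wide) {
    static_assert(sizeof(wchar_t) >= 4, "this sketch assumes a 32-bit wchar_t");
    std::string out;
    out.reserve(wide.size());
    for (wchar_t wc : wide) {
        const std::uint32_t cp = static_cast<std::uint32_t>(wc);
        // Reject values that are not Unicode scalar values
        // (out of range, or in the surrogate block).
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("not a Unicode scalar value");
        }
        if (cp < 0x80) {                       // 1 byte: plain ASCII
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The `c_str()` of the returned string can then be passed wherever an API expects a `const char*`. Keep the returned std::string alive for as long as the API uses the pointer.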

The wchar_t (and thus wstring) on Windows was convenient only when Unicode was limited to 65536 characters (i.e. wchar_t was used for UCS-2). Now that the 16-bit Windows wchar_t is used for UTF-16, the advantage of 1 wchar_t = 1 Unicode character is long gone, so you have the same disadvantages as with UTF-8. Nowadays, IMHO, the Linux approach is the most correct. (Another answer of mine covers UTF-16 and why Windows and Java use it.)
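To illustrate why a 16-bit code unit no longer corresponds to one character, the sketch below stores a single code point from outside the Basic Multilingual Plane (U+1F600, chosen here purely as an example) in the three standard encodings. The variable names are illustrative only.

```cpp
#include <string>

// One code point, U+1F600, in three encodings.
const std::u32string utf32 = U"\U0001F600";        // 1 code unit of 32 bits
const std::u16string utf16 = u"\U0001F600";        // 2 code units: a surrogate pair
const std::string    utf8  = "\xF0\x9F\x98\x80";   // 4 code units (bytes)
```

Here `utf32.size()` is 1, `utf16.size()` is 2, and `utf8.size()` is 4. Both UTF-16 and UTF-8 are variable-length, so neither gives constant-time indexing by code point.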

By the way, neither string nor wstring is encoding-aware, so you can't reliably use either of them to manipulate Unicode code points. I have heard that wxString from the wxWidgets toolkit handles UTF-8 nicely, but I have never researched it extensively.
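A small sketch of what "not encoding-aware" means in practice: `std::string::size()` reports bytes, not code points. The helper below, `count_code_points`, is a hypothetical name. It assumes its input is already valid UTF-8 and counts every byte that is not a continuation byte (continuation bytes have the bit pattern 10xxxxxx).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string assumed to be valid UTF-8.
// It performs no validation; malformed input gives a meaningless count.
std::size_t count_code_points(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Every code point contributes exactly one non-continuation byte.
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the UTF-8 bytes of "naïve" (`"na\xC3\xAFve"`), `size()` returns 6 while `count_code_points` returns 5. Note that code points are still not user-perceived characters; combining sequences need a Unicode-aware library.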

  1. Actually, as pointed out below, the kernel aims to be encoding-agnostic, i.e. it treats strings as opaque sequences of (NUL-terminated?) bytes (which is why encodings that use "larger" character types, like UTF-16, cannot be used). On the other hand, wherever actual string manipulation is done, the current locale setting is used, and on almost any modern Linux distribution it is set to UTF-8 by default (which seems a reasonable default to me).
