What are TCHAR strings and the 'A' or 'W' version of Win32 API functions?

Question

What are TCHAR strings, such as LPTSTR and LPCTSTR and how can I work with these? When I create a new project in Visual Studio it creates this code for me:

#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])
{
   return 0;
}

How can I, for instance, concatenate all the command line arguments?

If I want to open a file with the name given by the first command line argument, how can I do this? The Windows API defines 'A' and 'W' versions of many of its functions, such as CreateFile, CreateFileA and CreateFileW; how do these differ from one another, and which one should I use?

Recommended answer

Let me start off by saying that you should preferably not use TCHAR for new Windows projects and instead directly use Unicode. On to the actual answer:

The first thing we need to understand is how character sets work in Visual Studio. The project property page has an option to select the character set used:


  • Not Set
  • Use Unicode Character Set
  • Use Multi-Byte Character Set

Depending on which of the three options you choose, a lot of definitions change to accommodate the selected character set. Three main classes of definitions are affected: string types, string routines from tchar.h, and API functions:


  • 'Not Set' corresponds to TCHAR = char using ANSI encoding, where you use the standard 8-bit code page of the system for strings. All tchar.h string routines use the basic char versions. All API functions that work with strings will use the 'A' version of the API function.
  • 'Unicode' corresponds to TCHAR = wchar_t using UTF-16 encoding. All tchar.h string routines use the wchar_t versions. All API functions that work with strings will use the 'W' version of the API function.
  • 'Multi-Byte' corresponds to TCHAR = char, using some multi-byte encoding scheme. All tchar.h string routines use the multi-byte character set versions. All API functions that work with strings will use the 'A' version of the API function.

Related reading: About the "Character set" option in Visual Studio 2010 (http://stackoverflow.com/questions/9349342/about-the-character-set-option-in-visual-studio-2010)

The tchar.h header is a helper for using generic names for the C string operations on strings, that switch to the correct function for the given character set. For instance, _tcscat will switch to either strcat (not set), wcscat (unicode), or _mbscat (mbcs). _tcslen will switch to either strlen (not set), wcslen (unicode), or strlen (mbcs).

The switch happens by defining all _txxx symbols as macros that evaluate to the correct function, depending on the compiler switches.

The idea behind it is that you can use the encoding-agnostic types TCHAR (or _TCHAR) and the encoding-agnostic functions that work on them, from tchar.h, instead of the regular string functions from string.h.

Similarly, _tmain is defined to be either main or wmain. See also: What is the difference between _tmain() and main() in C++? (http://stackoverflow.com/questions/895827/what-is-the-difference-between-tmain-and-main-in-c)

A helper macro _T(..) is defined for getting string literals of the correct type, either "regular literals" or L"wchar_t literals".
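To make the mechanism concrete, here is a simplified, portable model of how such a header can switch a character type, a string routine and a literal macro together. The names demo_tchar, demo_tcslen, DEMO_T and DEMO_UNICODE are made up for this illustration; the real tchar.h is more elaborate and keys off the compiler's own character-set defines.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// DEMO_UNICODE is a hypothetical switch standing in for the project's
// character-set option. Define it before this point to take the wide path.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;        // plays the role of TCHAR
#define DEMO_T(x) L##x             // plays the role of _T(..)
#define demo_tcslen std::wcslen    // plays the role of _tcslen
#else
typedef char demo_tchar;
#define DEMO_T(x) x
#define demo_tcslen std::strlen
#endif

// Written once against the generic names; compiles under either setting.
std::size_t greeting_length()
{
    const demo_tchar* greeting = DEMO_T("hello");
    return demo_tcslen(greeting);
}
```

Either way greeting_length() sees a five-character string; only the underlying character type and the function actually called change.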

See the caveats mentioned here: Is TCHAR still relevant? -- dan04's answer

For the example of main in the question, the following code concatenates all the strings passed as command line arguments into one.

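#include <tchar.h>   /* _tmain, _TCHAR, _T, _tcscpy, _tcscat */

int _tmain(int argc, _TCHAR *argv[])
{
   TCHAR szCommandLine[1024];

   if (argc < 2) return 0;

   /* Start with the first argument, then append the others separated by
      spaces. There is no bounds checking: everything must fit in the buffer. */
   _tcscpy(szCommandLine, argv[1]);
   for (int i = 2; i < argc; ++i)
   {
       _tcscat(szCommandLine, _T(" "));
       _tcscat(szCommandLine, argv[i]);
   }

   /* szCommandLine now contains the command line arguments */

   return 0;
}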

(Error checking is omitted) This code works for all three cases of the character set, because everywhere we used TCHAR, the tchar.h string functions and _T for string literals. Forgetting to surround your string literals with _T(..) is a common source of compiler errors when writing such TCHAR-programs. If we had not done all these things, then switching character sets would cause the code to either not compile, or worse, compile but misbehave during runtime.

Windows API functions that work on strings, such as CreateFile and GetCurrentDirectory, are implemented in the Windows headers as macros that, like the tchar.h macros, switch to either the 'A' version or the 'W' version. For instance, CreateFile is a macro that is defined to CreateFileA for ANSI and MBCS, and to CreateFileW for Unicode.

Whenever you use the flat form (without 'A' or 'W') in your code, the actual function called will switch depending on the selected character set. You can force the use of a particular version by using the explicit 'A' or 'W' names.

The conclusion is that you should always use the unqualified name, unless you want to always refer to a specific version, independently of the character set option.
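The difference between the unqualified name and the explicit names can be sketched without the Windows headers. In this portable model, demo_open_a, demo_open_w, demo_open, DEMO_TEXT and DEMO_UNICODE are all hypothetical stand-ins; they only mimic the way a name like CreateFile is a macro over two real functions.

```cpp
#include <string>

// Hypothetical stand-ins for a narrow ('A') and a wide ('W') entry point.
// Each just reports which version was called.
std::string demo_open_a(const char*)    { return "A"; }
std::string demo_open_w(const wchar_t*) { return "W"; }

// The unqualified name is only a macro; DEMO_UNICODE (hypothetical)
// stands in for the character-set option.
#ifdef DEMO_UNICODE
#define demo_open demo_open_w
#define DEMO_TEXT(x) L##x
#else
#define demo_open demo_open_a
#define DEMO_TEXT(x) x
#endif

// Follows whatever the character-set switch selects.
std::string via_generic_name()
{
    return demo_open(DEMO_TEXT("file.txt"));
}

// Always the wide version, regardless of the switch.
std::string via_explicit_wide()
{
    return demo_open_w(L"file.txt");
}
```

With DEMO_UNICODE left undefined, via_generic_name() ends up in the 'A' function while via_explicit_wide() still reaches the 'W' one, which is the situation the explicit names exist for.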

For the example in the question, where we want to open the file given by the first argument:

int _tmain(int argc, _TCHAR *argv[])
{  
   if (argc < 2) return 1;

   HANDLE hFile = CreateFile(argv[1], GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);

   /* Read from file and do other stuff */
   ...

   CloseHandle(hFile);

   return 0;
}

(Error checking is omitted) Note that in this example we did not need any of the TCHAR-specific facilities for the API call, because the macro definitions have already taken care of this for us.

We've seen how we can use the tchar.h routines to use C style string operations to work with TCHARs, but it would be nice if we could leverage C++ strings to work with this.

My advice would foremost be to not use TCHAR and to use Unicode directly instead (see the conclusion further down), but if you want to work with TCHAR you can do the following.

To use TCHAR, what we want is an instance of std::basic_string that uses TCHAR. You can do this by typedefing your own tstring:

typedef std::basic_string<TCHAR> tstring;

For string literals, don't forget to use _T.
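As a rough sketch of what this buys you, the command line concatenation can be written against std::basic_string instead of a fixed buffer. The helper join_arguments is a name invented for this example; it is templated on the character type so that it stays portable, and in a TCHAR program you would instantiate it with TCHAR.

```cpp
#include <string>

// Hypothetical helper: joins argv[1..argc-1] with a separator character.
// CharT would be TCHAR in a TCHAR program (char or wchar_t underneath).
template <typename CharT>
std::basic_string<CharT> join_arguments(int argc,
                                        const CharT* const argv[],
                                        CharT separator)
{
    std::basic_string<CharT> joined;
    for (int i = 1; i < argc; ++i)   // argv[0] is the program name
    {
        if (i > 1) joined += separator;
        joined += argv[i];
    }
    return joined;                   // grows as needed, no fixed-size buffer
}
```

Inside _tmain this could be called as join_arguments<TCHAR>(argc, argv, _T(' ')) and stored in a tstring. Unlike the fixed 1024-element array, the string cannot overflow.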

You'll also need to use the correct versions of cin and cout. You can use references to implement a tcin and tcout:

#include <iostream>

#if defined(_UNICODE)
std::wistream &tcin = std::wcin;
std::wostream &tcout = std::wcout;
#else
std::istream &tcin = std::cin;
std::ostream &tcout = std::cout;
#endif

This should allow you to do almost anything. There might be the occasional exception, such as std::to_string and std::to_wstring, for which you can find a similar workaround.
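One possible shape for such a workaround, as a sketch: a small function template that is specialized for char and wchar_t and forwards to std::to_string or std::to_wstring. The name number_to_string is made up for this example and only int is handled.

```cpp
#include <string>

// Hypothetical helper. The primary template is only declared; the two
// explicit specializations below cover the character types TCHAR can be.
template <typename CharT>
std::basic_string<CharT> number_to_string(int value);

template <>
std::basic_string<char> number_to_string<char>(int value)
{
    return std::to_string(value);
}

template <>
std::basic_string<wchar_t> number_to_string<wchar_t>(int value)
{
    return std::to_wstring(value);
}
```

In a TCHAR program you would then write number_to_string<TCHAR>(n) and get the matching string type. If this lives in a header, the specializations should be marked inline.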

This answer (hopefully) details what TCHAR is and how it's used and intertwined with Visual Studio and the Windows headers. However, we should also wonder if we want to use it.

My advice is to directly use Unicode for all new Windows programs and don't use TCHAR at all!

Others giving the same advice: Is TCHAR still relevant?

To use Unicode after creating a new project, first ensure the character set is set to Unicode. Then, remove the #include <tchar.h> from your source file (or from stdafx.h). Fix up any TCHAR or _TCHAR to wchar_t and _tmain to wmain:

int wmain(int argc, wchar_t *argv[])

For non-console projects, the entry point for Windows applications is WinMain and will appear in TCHAR-jargon as

int APIENTRY _tWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPTSTR    lpCmdLine, int nCmdShow)

and should become

int APIENTRY wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPWSTR    lpCmdLine, int nCmdShow)

After this, only use wchar_t strings and/or std::wstrings.


  • Be careful when writing sizeof(szMyString) when using TCHAR arrays (strings), because for ANSI this is the size both in characters and in bytes, for Unicode this is only the size in bytes and the number of characters is at most half, and for MBCS this is the size in bytes and the number of characters may or may not be equal. Both Unicode and MBCS can use multiple TCHARs to encode a single character.
  • Mixing TCHAR stuff and fixed char or wchar_t is very annoying; you have to convert the strings from one to the other, using the correct code page! A simple copy will not work in the general case.
  • There is a slight difference between _UNICODE and UNICODE, relevant if you want to conditionally define your own functions. See Why both UNICODE and _UNICODE?
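The sizeof caveat from the first bullet is easy to see in a small sketch. The helpers element_count and byte_size are names invented for this example; the point is only that the two values agree for char arrays and differ for wchar_t arrays.

```cpp
#include <cstddef>

// Number of elements in a fixed-size array, independent of element type.
template <typename CharT, std::size_t N>
constexpr std::size_t element_count(const CharT (&)[N])
{
    return N;
}

// Size of the same array in bytes; this is what sizeof(array) reports.
// It equals the element count only when the element is one byte wide.
template <typename CharT, std::size_t N>
constexpr std::size_t byte_size(const CharT (&)[N])
{
    return N * sizeof(CharT);
}
```

For a TCHAR szMyString[16], element_count gives 16 under every setting, while byte_size gives 16 for char and 16 * sizeof(wchar_t) for wchar_t. Neither value is a count of user-visible characters, since a single character may take several elements. Visual C++ also ships a _countof macro for the element count.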

A very good, complementary answer is: Difference between MBCS and UTF-8 on Windows (http://stackoverflow.com/questions/3298569/difference-between-mbcs-and-utf-8-on-windows)
