What are TCHAR strings and the 'A' or 'W' version of Win32 API functions?
Question
What are TCHAR strings, such as LPTSTR and LPCTSTR, and how can I work with these? When I create a new project in Visual Studio, it creates this code for me:
#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])
{
    return 0;
}
How can I, for instance, concatenate all the command line arguments?
If I want to open a file with the name given by the first command line argument, how can I do this? The Windows API defines 'A' and 'W' versions of many of its functions, such as CreateFile, CreateFileA and CreateFileW; so how do these differ from one another, and which one should I use?
Recommended answer
Let me start off by saying that you should preferably not use TCHAR for new Windows projects and instead use Unicode directly. On to the actual answer:
The first thing we need to understand is how character sets work in Visual Studio. The project property page has an option to select the character set used:
- Not Set
- Use Unicode Character Set
- Use Multi-Byte Character Set
Depending on which of the three options you choose, a lot of definitions change to accommodate the selected character set. There are three main classes: strings, string routines from tchar.h, and API functions:
- 'Not Set' corresponds to TCHAR = char using ANSI encoding, where you use the standard 8-bit code page of the system for strings. All tchar.h string routines use the basic char versions. All API functions that work with strings will use the 'A' version of the API function.
- 'Unicode' corresponds to TCHAR = wchar_t using UTF-16 encoding. All tchar.h string routines use the wchar_t versions. All API functions that work with strings will use the 'W' version of the API function.
- 'Multi-Byte' corresponds to TCHAR = char, using some multi-byte encoding scheme. All tchar.h string routines use the multi-byte character set versions. All API functions that work with strings will use the 'A' version of the API function.
Related reading: About the "Character Set" option in Visual Studio 2010 (http://stackoverflow.com/questions/9349342/about-the-character-set-option-in-visual-studio-2010)
The tchar.h header is a helper that provides generic names for the C string operations and switches them to the correct function for the given character set. For instance, _tcscat will switch to either strcat (Not Set), wcscat (Unicode), or _mbscat (MBCS). _tcslen will switch to either strlen (Not Set), wcslen (Unicode), or strlen (MBCS).
The switch happens by defining all _txxx symbols as macros that evaluate to the correct function, depending on the compiler switches.
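To make the macro switch concrete, here is a minimal, portable sketch of the same mechanism. The names MY_UNICODE, my_tchar, MY_T and my_tcslen are hypothetical stand-ins for _UNICODE, TCHAR, _T and _tcslen; the real tchar.h is more elaborate and also handles the MBCS case, which this sketch leaves out.

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for TCHAR, _T and _tcslen. Define MY_UNICODE
// (for example with -DMY_UNICODE) to select the wide-character variants.
#ifdef MY_UNICODE
typedef wchar_t my_tchar;
#define MY_T(s) L##s
#define my_tcslen std::wcslen
#else
typedef char my_tchar;
#define MY_T(s) s
#define my_tcslen std::strlen
#endif

// Written once against the generic names; compiles in either mode.
std::size_t greeting_length()
{
    const my_tchar *greeting = MY_T("hello");
    return my_tcslen(greeting);
}
```

The function body never mentions char or wchar_t, which is the property the tchar.h names are meant to give you.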
The idea behind it is that you can use the encoding-agnostic types TCHAR (or _TCHAR) and the encoding-agnostic functions from tchar.h that work on them, instead of the regular string functions from string.h.
Similarly, _tmain is defined to be either main or wmain. See also: What is the difference between _tmain() and main() in C++? (http://stackoverflow.com/questions/895827/what-is-the-difference-between-tmain-and-main-in-c)
A helper macro _T(..) is defined for getting string literals of the correct type, either "regular literals" or L"wchar_t literals".
See the caveats mentioned here: Is TCHAR still relevant? -- dan04's answer
For the example of main in the question, the following code concatenates all the strings passed as command line arguments into one.
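#include <tchar.h>

int _tmain(int argc, _TCHAR *argv[])
{
    /* Fixed-size buffer; no bounds checking is performed below. */
    TCHAR szCommandLine[1024];

    if (argc < 2) return 0;

    /* Start with the first argument, then append the rest separated by spaces. */
    _tcscpy(szCommandLine, argv[1]);
    for (int i = 2; i < argc; ++i)
    {
        _tcscat(szCommandLine, _T(" "));
        _tcscat(szCommandLine, argv[i]);
    }
    /* szCommandLine now contains the command line arguments */
    return 0;
}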
(Error checking is omitted.) This code works for all three character set options, because we consistently used TCHAR, the tchar.h string functions, and _T for string literals. Forgetting to surround your string literals with _T(..) is a common source of compiler errors when writing such TCHAR programs.
If we had not done all these things, then switching character sets would cause the code to either not compile or, worse, compile but misbehave at runtime.
Windows API functions that work on strings, such as CreateFile and GetCurrentDirectory, are implemented in the Windows headers as macros that, like the tchar.h macros, switch to either the 'A' version or the 'W' version. For instance, CreateFile is a macro that is defined as CreateFileA for ANSI and MBCS, and as CreateFileW for Unicode.
Whenever you use the plain form (without 'A' or 'W') in your code, the actual function called will depend on the selected character set. You can force the use of a particular version by using the explicit 'A' or 'W' names.
The conclusion is that you should always use the unqualified name, unless you always want to refer to a specific version, independently of the character set option.
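The 'A'/'W' scheme can be imitated in portable code, which may help to see why the unsuffixed name is nothing more than a macro. In the sketch below, DescribePathA, DescribePathW, DescribePath and MY_UNICODE are all hypothetical names invented for illustration; they only mimic the CreateFileA / CreateFileW / CreateFile pattern and are not part of the Windows API.

```cpp
#include <string>

// Hypothetical API with a narrow ('A') and a wide ('W') entry point,
// mimicking the CreateFileA / CreateFileW naming scheme.
inline std::string DescribePathA(const char *path)
{
    return std::string("A:") + path;
}

inline std::wstring DescribePathW(const wchar_t *path)
{
    return std::wstring(L"W:") + path;
}

// The unsuffixed name is only a macro. MY_UNICODE is an assumed switch
// playing the role of UNICODE in the Windows headers.
#ifdef MY_UNICODE
#define DescribePath DescribePathW
#else
#define DescribePath DescribePathA
#endif
```

Without MY_UNICODE defined, DescribePath("x.txt") resolves to DescribePathA, while the explicit DescribePathW(L"x.txt") stays available regardless of the switch.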
For the example in the question, where we want to open the file given by the first argument:
#include <windows.h>
#include <tchar.h>

int _tmain(int argc, _TCHAR *argv[])
{
    if (argc < 2) return 1;
    HANDLE hFile = CreateFile(argv[1], GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
    /* Read from file and do other stuff */
    ...
    CloseHandle(hFile);
    return 0;
}
(Error checking is omitted.) Note that in this example we did not need any TCHAR-specific constructs, because the macro definitions have already taken care of this for us.
We've seen how we can use the tchar.h routines to work with TCHARs using C-style string operations, but it would be nice if we could leverage C++ strings to work with them.
My advice would foremost be to not use TCHAR and instead use Unicode directly (see the Conclusion section), but if you want to work with TCHAR, you can do the following.
To use TCHAR, what we want is an instance of std::basic_string that uses TCHAR. You can do this by typedefing your own tstring:
typedef std::basic_string<TCHAR> tstring;
For string literals, don't forget to use _T.
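As an illustration of what a std::basic_string instance buys you, the command line concatenation can be written without a fixed-size buffer. This is only a sketch: join_arguments is a hypothetical helper, and it is templated on the character type so that it stays portable; in a TCHAR program you would presumably instantiate it with TCHAR.

```cpp
#include <string>

// Joins argv[1] .. argv[argc - 1] with single spaces. CharT is char or
// wchar_t; in a TCHAR program it would be TCHAR (an assumption here,
// since <tchar.h> is not included to keep the sketch portable).
template <typename CharT>
std::basic_string<CharT> join_arguments(int argc, const CharT *const argv[])
{
    std::basic_string<CharT> result;
    for (int i = 1; i < argc; ++i)
    {
        if (i > 1)
            result += CharT(' ');   // a plain space converts to either type
        result += argv[i];
    }
    return result;
}
```

Inside _tmain, a call such as join_arguments<TCHAR>(argc, argv) would then yield the same type as the tstring typedef, and the string grows as needed instead of overflowing a 1024-element array.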
You'll also need to use the correct versions of cin and cout. You can use references to implement a tcin and tcout:
#include <iostream>

#if defined(_UNICODE)
std::wistream &tcin = std::wcin;
std::wostream &tcout = std::wcout;
#else
std::istream &tcin = std::cin;
std::ostream &tcout = std::cout;
#endif
This should allow you to do almost anything. There might be the occasional exception, such as std::to_string and std::to_wstring, for which you can find a similar workaround.
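One possible shape for such a workaround is a single generic name that forwards to std::to_string or std::to_wstring. The helper to_basic_string below is hypothetical and requires C++11; it is templated on the character type to stay portable, and a TCHAR program could wrap it as to_basic_string<TCHAR>.

```cpp
#include <string>

// Hypothetical helper: one generic name that forwards to std::to_string
// or std::to_wstring depending on the requested character type.
template <typename CharT>
std::basic_string<CharT> to_basic_string(long long value);

template <>
inline std::string to_basic_string<char>(long long value)
{
    return std::to_string(value);
}

template <>
inline std::wstring to_basic_string<wchar_t>(long long value)
{
    return std::to_wstring(value);
}
```

Only the two specializations are defined, so asking for any other character type fails at link time rather than silently doing the wrong thing.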
This answer (hopefully) details what TCHAR is and how it's used and intertwined with Visual Studio and the Windows headers. However, we should also ask whether we want to use it.
My advice is to use Unicode directly for all new Windows programs and not use TCHAR at all!
Others giving the same advice: Is TCHAR still relevant?
To use Unicode after creating a new project, first ensure the character set is set to Unicode. Then, remove the #include <tchar.h> from your source file (or from stdafx.h). Change any TCHAR or _TCHAR to wchar_t and _tmain to wmain:
int wmain(int argc, wchar_t *argv[])
For non-console projects, the entry point for Windows applications is WinMain, which will appear in TCHAR jargon as
int APIENTRY _tWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPTSTR lpCmdLine, int nCmdShow)
and should become
int APIENTRY wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPWSTR lpCmdLine, int nCmdShow)
After this, only use wchar_t strings and/or std::wstrings.
- Be careful when writing sizeof(szMyString) when using TCHAR arrays (strings): for ANSI this is the size both in characters and in bytes; for Unicode this is only the size in bytes, and the number of characters is at most half of that; for MBCS this is the size in bytes, and the number of characters may or may not be equal. Both Unicode and MBCS can use multiple TCHARs to encode a single character.
- Mixing TCHAR stuff and fixed char or wchar_t is very annoying; you have to convert the strings from one to the other, using the correct code page! A simple copy will not work in the general case.
- There is a slight difference between _UNICODE and UNICODE, relevant if you want to conditionally define your own functions. See Why both UNICODE and _UNICODE?
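The sizeof caveat is easy to demonstrate. The sketch below uses a hypothetical element_count helper to stay portable; on Windows, the _countof and ARRAYSIZE macros serve the same purpose. The buffer names are made up for illustration.

```cpp
#include <cstddef>

// Number of elements in a fixed-size array, independent of the element
// type. Hypothetical helper; Windows code would typically use _countof
// or ARRAYSIZE instead.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N])
{
    return N;
}

// A buffer of 16 characters in each flavour.
char    narrow_buffer[16];
wchar_t wide_buffer[16];

// sizeof reports bytes, so it overstates the capacity of the wide buffer.
static_assert(sizeof(narrow_buffer) == 16, "char is one byte");
static_assert(sizeof(wide_buffer) == 16 * sizeof(wchar_t), "bytes, not characters");
```

When a function expects a length in characters, passing element_count(wide_buffer) instead of sizeof(wide_buffer) avoids telling it the buffer is larger than it really is.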
A very good, complementary answer is: Difference between MBCS and UTF-8 on Windows (http://stackoverflow.com/questions/3298569/difference-between-mbcs-and-utf-8-on-windows)