What are TCHAR strings and the 'A' or 'W' version of Win32 API functions?
Question
What are TCHAR strings, such as LPTSTR and LPCTSTR, and how can I work with these? When I create a new project in Visual Studio it creates this code for me:
#include <tchar.h>
int _tmain(int argc, _TCHAR* argv[])
{
    return 0;
}
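How can I, for instance, concatenate all the command line arguments?
If I want to open a file with the name given by the first command line argument, how can I do this? The Windows API defines 'A' and 'W' versions of many of its functions, such as CreateFile, CreateFileA and CreateFileW; so how do these differ from one another and which one should I use?
Answer
Let me start off by saying that you should preferably not use TCHAR for new Windows projects and instead directly use Unicode. On to the actual answer: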
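Character sets
The first thing we need to understand is how character sets work in Visual Studio. The project property page has an option to select the character set used:
- Not Set
- Use Unicode Character Set
- Use Multi-Byte Character Set
Depending on which of the three options you choose, a lot of definitions change to accommodate the selected character set. There are three main classes: strings, string routines from tchar.h, and API functions: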
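- 'Not Set' corresponds to TCHAR = char using ANSI encoding, where you use the standard 8-bit code page of the system for strings. All tchar.h string routines use the basic char versions. All API functions that work with strings will use the 'A' version of the API function.
- 'Unicode' corresponds to TCHAR = wchar_t using UTF-16 encoding. All tchar.h string routines use the wchar_t versions. All API functions that work with strings will use the 'W' version of the API function.
- 'Multi-Byte' corresponds to TCHAR = char, using some multi-byte encoding scheme. All tchar.h string routines use the multi-byte character set versions. All API functions that work with strings will use the 'A' version of the API function.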
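The type switch described in the list above can be sketched in portable C++. This is only an illustration of the mechanism: DEMO_UNICODE and demo_tchar are hypothetical stand-ins for the real _UNICODE switch and the TCHAR typedef, so the sketch compiles without the Windows headers.

```cpp
#include <cstddef>
#include <type_traits>

// DEMO_UNICODE and demo_tchar are hypothetical names used for illustration;
// the real headers key off UNICODE / _UNICODE and define TCHAR.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;   // "Unicode" setting: wide characters
#else
typedef char demo_tchar;      // "Not Set" and "Multi-Byte": 8-bit char
#endif

// True when the build selected the wide character type.
constexpr bool demo_is_wide() {
    return std::is_same<demo_tchar, wchar_t>::value;
}

// Size in bytes of one character unit for the selected build.
constexpr std::size_t demo_unit_size() {
    return sizeof(demo_tchar);
}
```

Compiling the same file with and without -DDEMO_UNICODE (or /DDEMO_UNICODE) changes what demo_tchar means everywhere it is used, which is the effect the project setting has on TCHAR.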
Related reading: About the "Character set" option in Visual Studio 2010
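The tchar.h header
The tchar.h header is a helper that provides generic names for the C string operations; each generic name switches to the correct function for the given character set. For instance, _tcscat will switch to either strcat (not set), wcscat (Unicode), or _mbscat (MBCS). _tcslen will switch to either strlen (not set), wcslen (Unicode), or strlen (MBCS).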
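The switch happens by defining all _txxx symbols as macros that evaluate to the correct function, depending on the compiler switches.
The idea behind it is that you can use the encoding-agnostic types TCHAR (or _TCHAR) and the encoding-agnostic functions that work on them, from tchar.h, instead of the regular string functions from string.h.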
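A minimal sketch of how such a macro switch can work, again with hypothetical names (demo_tcslen, demo_tcscat, DEMO_UNICODE) rather than the real tchar.h symbols, and covering only the char and wchar_t cases:

```cpp
#include <cstddef>
#include <cstring>
#include <cwchar>

// Hypothetical stand-ins for the generic-text mappings; the real header
// also handles the _MBCS case, which this sketch leaves out.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define demo_tcslen std::wcslen
#define demo_tcscat std::wcscat
#else
typedef char demo_tchar;
#define demo_tcslen std::strlen
#define demo_tcscat std::strcat
#endif

// Appends src to dst and returns the new length in character units.
// The caller must make sure dst is large enough (no bounds checking here).
inline std::size_t demo_append(demo_tchar* dst, const demo_tchar* src) {
    demo_tcscat(dst, src);
    return demo_tcslen(dst);
}
```

Because demo_append is written only in terms of the generic names, its source does not change when the character set does; only the macro expansions differ.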
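Similarly, _tmain is defined to be either main or wmain. See also: What is the difference between _tmain() and main() in C++?
A helper macro _T(..) is defined for getting string literals of the correct type, either "regular literals" or L"wchar_t literals".
See the caveats mentioned here: Is TCHAR still relevant? (dan04's answer)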
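The literal helper can be sketched the same way. DEMO_T below is a hypothetical macro that mimics the token-pasting idea behind _T(..); it is not the real definition, just an assumption-labelled illustration.

```cpp
#include <cstddef>

// Hypothetical names; the real macro is _T(..) from tchar.h.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#define DEMO_T_IMPL(x) L##x   // paste an L prefix onto the literal
#else
typedef char demo_tchar;
#define DEMO_T_IMPL(x) x      // leave the literal unchanged
#endif
// Extra level of indirection so that macro arguments are expanded
// before the L prefix is pasted on.
#define DEMO_T(x) DEMO_T_IMPL(x)

// The literal has the element type demo_tchar in either build.
inline const demo_tchar* demo_greeting() {
    return DEMO_T("hello");
}
```

A plain "hello" would only compile in the narrow build and L"hello" only in the wide build, which is why forgetting the macro tends to show up as a type error after switching the character set.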
_tmain example
For the example of main in the question, the following code concatenates all the strings passed as command line arguments into one.
#include <tchar.h>
int _tmain(int argc, _TCHAR *argv[])
{
    TCHAR szCommandLine[1024];   /* fixed-size buffer; length is not checked */
    if (argc < 2) return 0;
    _tcscpy(szCommandLine, argv[1]);
    for (int i = 2; i < argc; ++i)
    {
        _tcscat(szCommandLine, _T(" "));
        _tcscat(szCommandLine, argv[i]);
    }
    /* szCommandLine now contains the command line arguments */
    return 0;
}
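(Error checking is omitted) This code works for all three cases of the character set, because we used TCHAR, the tchar.h string functions and _T for string literals everywhere. Forgetting to surround your string literals with _T(..) is a common source of compiler errors when writing such TCHAR programs.
If we had not done all these things, then switching character sets would cause the code to either not compile, or worse, compile but misbehave during runtime.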
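The example above leaves out bounds checking on the fixed 1024-element buffer. One way to avoid the fixed buffer altogether is to build the result in a std::basic_string. The helper below, join_args, is a hypothetical name and not part of tchar.h; it is a sketch that takes the character type as a template parameter instead of relying on the TCHAR macros.

```cpp
#include <string>

// join_args is a hypothetical helper for illustration only.
// CharT would be char in a narrow build and wchar_t in a wide build.
template <typename CharT>
std::basic_string<CharT> join_args(int argc, const CharT* const argv[]) {
    std::basic_string<CharT> joined;
    for (int i = 1; i < argc; ++i) {       // argv[0] is the program name
        if (i > 1) joined += CharT(' ');   // single space between arguments
        joined += argv[i];                 // the string grows as needed
    }
    return joined;
}
```

Since the string manages its own storage, a long command line cannot overrun a buffer; the trade-off is a dependency on the C++ standard library rather than the C-style routines.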
Windows API functions
Windows API functions that work on strings, such as CreateFile and GetCurrentDirectory, are implemented in the Windows headers as macros that, like the tchar.h macros, switch to either the 'A' version or 'W' version. For instance, CreateFile is a macro that is defined to CreateFileA for ANSI and MBCS, and to CreateFileW for Unicode.
Whenever you use the flat form (without 'A' or 'W') in your code, the actual function called will switch depending on the selected character set. You can force the use of a particular version by using the explicit 'A' or 'W' names.
The conclusion is that you should always use the unqualified name, unless you want to always refer to a specific version, independently of the character set option.
For the example in the question, where we want to open the file given by the first argument:
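The 'A'/'W' naming pattern can be imitated in a few lines. Every Demo* name below is hypothetical and exists only to mirror the shape of pairs such as CreateFileA and CreateFileW; nothing here calls the Windows API.

```cpp
#include <string>

// Hypothetical function pair mimicking the 'A' / 'W' naming convention.
inline std::string DemoDescribeA(const char*) { return "A"; }      // narrow version
inline std::string DemoDescribeW(const wchar_t*) { return "W"; }   // wide version

// The unqualified name is a macro that picks one of the two versions.
#ifdef DEMO_UNICODE
#define DemoDescribe DemoDescribeW
#else
#define DemoDescribe DemoDescribeA
#endif
```

Code that calls DemoDescribe follows the build setting, while code that calls DemoDescribeW explicitly always gets the wide version regardless of the setting.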
#include <windows.h>
#include <tchar.h>
int _tmain(int argc, _TCHAR *argv[])
{
    if (argc < 2) return 1;
    HANDLE hFile = CreateFile(argv[1], GENERIC_READ, 0, NULL, OPEN_EXISTING, 0, NULL);
    /* Read from file and do other stuff */
    ...
    CloseHandle(hFile);
    return 0;
}
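(Error checking is omitted) Note that for this example, nowhere did we need to use any of the TCHAR specific stuff, because the macro definitions have already taken care of this for us.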
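Utilising C++ strings
We've seen how we can use the tchar.h routines to use C style string operations to work with TCHARs, but it would be nice if we could leverage C++ strings to work with this.
My advice would foremost be to not use TCHAR and instead use Unicode directly (see the Conclusion section), but if you want to work with TCHAR you can do the following.
To use TCHAR, what we want is an instance of std::basic_string that uses TCHAR. You can do this by typedefing your own tstring: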
typedef std::basic_string<TCHAR> tstring;
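For string literals, don't forget to use _T.
You'll also need to use the correct versions of cin and cout. You can use references to implement a tcin and tcout: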
#include <iostream>
#if defined(_UNICODE)
std::wistream &tcin = std::wcin;
std::wostream &tcout = std::wcout;
#else
std::istream &tcin = std::cin;
std::ostream &tcout = std::cout;
#endif
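This should allow you to do almost anything. There might be the occasional exception, such as std::to_string and std::to_wstring, for which you can find a similar workaround.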
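One possible shape for such a workaround is a small forwarding function. The names demo_tchar, demo_tstring, demo_to_tstring and DEMO_UNICODE are all hypothetical; in a real TCHAR project the same pattern would presumably key off _UNICODE and use the tstring typedef instead.

```cpp
#include <string>

// Hypothetical stand-ins for TCHAR and tstring.
#ifdef DEMO_UNICODE
typedef wchar_t demo_tchar;
#else
typedef char demo_tchar;
#endif
typedef std::basic_string<demo_tchar> demo_tstring;

// Forwards to std::to_wstring or std::to_string depending on the build,
// so callers can stay agnostic of the character type.
inline demo_tstring demo_to_tstring(int value) {
#ifdef DEMO_UNICODE
    return std::to_wstring(value);
#else
    return std::to_string(value);
#endif
}
```

Only the int overload is sketched here; the other numeric overloads of std::to_string would need the same treatment if they are used.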
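Conclusion
This answer (hopefully) details what TCHAR is and how it's used and intertwined with Visual Studio and the Windows headers. However, we should also wonder if we want to use it.
My advice is to directly use Unicode for all new Windows programs and not use TCHAR at all!
Others giving the same advice: Is TCHAR still relevant?
To use Unicode after creating a new project, first ensure the character set is set to Unicode. Then, remove the #include <tchar.h> from your source file (or from stdafx.h). Fix up any TCHAR or _TCHAR to wchar_t and _tmain to wmain: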
int wmain(int argc, wchar_t *argv[])
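For non-console projects, the entry point for Windows applications is WinMain and will appear in TCHAR-jargon as
int APIENTRY _tWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPTSTR lpCmdLine, int nCmdShow)
and should become
int APIENTRY wWinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPWSTR lpCmdLine, int nCmdShow)
After this, only use wchar_t strings and/or std::wstrings.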
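Further caveats
- Be careful with sizeof(szMyString) when using TCHAR arrays (strings), because for ANSI this is the size both in characters and in bytes, for Unicode this is only the size in bytes and the number of characters is at most half, and for MBCS this is the size in bytes and the number of characters may or may not be equal. Both Unicode and MBCS can use multiple TCHARs to encode a single character.
- Mixing TCHAR stuff and fixed char or wchar_t is very annoying; you have to convert the strings from one to the other, using the correct code page! A simple copy will not work in the general case.
- There is a slight difference between _UNICODE and UNICODE, relevant if you want to conditionally define your own functions. See Why both UNICODE and _UNICODE?
A very good, complementary answer is: Difference between MBCS and UTF-8 on Windows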
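The sizeof caveat in the list above is easy to trip over when a buffer size is passed to a function that expects a count of character units. A minimal sketch of an element-count helper follows; demo_count_of is a hypothetical name (MSVC and the Windows headers offer the _countof and ARRAYSIZE macros for the same purpose).

```cpp
#include <cstddef>

// Hypothetical helper: returns the number of elements in a fixed-size
// array, independent of how many bytes each element occupies.
template <typename T, std::size_t N>
constexpr std::size_t demo_count_of(const T (&)[N]) {
    return N;
}
```

For a char array the element count and sizeof agree, but for a wchar_t array sizeof is larger by a factor of sizeof(wchar_t), so using the element count keeps the code correct when the character type changes. It still counts code units, not characters, so the remark about multiple TCHARs per character continues to apply.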