What encoding are filenames in NTFS stored as?
Question
I'm just getting started on some programming to handle files with non-English names on a WinXP system. I've done some recommended reading on Unicode and I think I get the basic idea, but some parts are still not very clear to me.
Specifically, in what encoding (UTF-8, UTF-16LE/BE) are file names (not the content, but the actual name of the file) stored in NTFS? Is it possible to open any file using fopen(), which takes a char*, or do I have no choice but to use _wfopen(), which takes a wchar_t* and presumably expects a UTF-16 string?
I tried manually feeding a UTF-8 encoded string to fopen(), e.g.:
unsigned char filename[] = {0xEA, 0xB0, 0x80, 0x2E, 0x74, 0x78, 0x74, 0x0}; // 가.txt
FILE* f = fopen((char*)filename, "wb+");
but this came out as 'ê°€.txt'.
I was under the impression (which may be wrong) that a UTF-8 encoded string would suffice for opening any filename under Windows, because I seem to vaguely remember some Windows applications passing around char*, not wchar_t*, and having no problems.
Can anyone explain this?
Recommended answer
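NTFS stores filenames in UTF-16; however, fopen interprets its char* argument in the current ANSI code page, not as UTF-8. That is why the three UTF-8 bytes 0xEA 0xB0 0x80 came out as the three characters 'ê°€': each byte was treated as a separate character, which is consistent with an ANSI code page of Windows-1252.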
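To make the mismatch concrete, here is a minimal sketch of what a byte-per-character ANSI conversion does to the bytes from the question. It assumes the ANSI code page is Windows-1252 (the actual code page depends on the system locale), and cp1252_to_utf16 is a hypothetical helper written for illustration, not a Windows API.

```c
#include <stddef.h>
#include <stdint.h>

/* Windows-1252 mapping for bytes 0x80..0x9F. Every other byte maps to the
   code point with the same numeric value. Slots that Windows-1252 leaves
   unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D) are passed through unchanged. */
static const uint16_t cp1252_high[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Hypothetical helper (assumption: ANSI code page is Windows-1252).
   Widens a NUL-terminated byte string one byte at a time, producing one
   UTF-16 code unit per input byte. `out` must have room for
   strlen(in) + 1 units. Returns the number of units written, not
   counting the terminator. */
static size_t cp1252_to_utf16(const unsigned char *in, uint16_t *out)
{
    size_t n = 0;
    for (; in[n] != 0; ++n) {
        unsigned char b = in[n];
        out[n] = (b >= 0x80 && b <= 0x9F) ? cp1252_high[b - 0x80] : b;
    }
    out[n] = 0;
    return n;
}
```

Fed the array from the question, this yields U+00EA (ê), U+00B0 (°), U+20AC (€) followed by ".txt": seven UTF-16 units instead of the five that the intended name 가.txt needs.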
To use a UTF-16 encoded file name you need the Unicode (wide-character) versions of the file-open calls. Defining UNICODE and _UNICODE in your project makes generic names such as CreateFile resolve to their wide versions (CreateFileW). Then use the CreateFile call or the _wfopen call, which takes wchar_t* arguments.
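As a rough sketch of how a UTF-8 name could be bridged to the wide-character call: the hand-written decoder below turns UTF-8 bytes into UTF-16 code units, and a Windows-only wrapper passes the result to _wfopen. The names utf8_to_utf16 and fopen_utf8 and the 260-unit buffer are assumptions made for illustration. The decoder does not reject overlong or surrogate encodings; on Windows one would more likely call MultiByteToWideChar with CP_UTF8 instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: decode NUL-terminated UTF-8 into UTF-16 code units.
   `cap` is the capacity of `out` in units, terminator included.
   Returns the number of units written (excluding the terminator), or
   (size_t)-1 on malformed input or insufficient space. */
static size_t utf8_to_utf16(const unsigned char *in, uint16_t *out, size_t cap)
{
    size_t n = 0;
    if (cap == 0) return (size_t)-1;
    while (*in) {
        uint32_t cp;
        int extra;
        unsigned char b = *in++;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;               /* stray continuation byte */
        while (extra-- > 0) {
            if ((*in & 0xC0) != 0x80) return (size_t)-1;  /* truncated */
            cp = (cp << 6) | (uint32_t)(*in++ & 0x3F);
        }
        if (cp > 0x10FFFF) return (size_t)-1;
        if (cp >= 0x10000) {                  /* needs a surrogate pair */
            if (n + 2 >= cap) return (size_t)-1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            if (n + 1 >= cap) return (size_t)-1;
            out[n++] = (uint16_t)cp;
        }
    }
    out[n] = 0;
    return n;
}

#ifdef _WIN32
#include <wchar.h>

/* Hypothetical wrapper: open a file whose name is given in UTF-8.
   On Windows wchar_t is 16 bits wide, so each UTF-16 unit fits in one
   wchar_t. The 260-unit buffer is an illustrative limit only. */
static FILE *fopen_utf8(const char *name, const wchar_t *mode)
{
    uint16_t units[260];
    wchar_t wname[260];
    size_t i, n = utf8_to_utf16((const unsigned char *)name, units, 260);
    if (n == (size_t)-1) return NULL;
    for (i = 0; i <= n; ++i) wname[i] = (wchar_t)units[i];
    return _wfopen(wname, mode);
}
#endif
```

With this in place, the call from the question would become something like fopen_utf8((const char *)filename, L"wb+"), and the bytes 0xEA 0xB0 0x80 decode to the single code unit U+AC00 (가).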