Shift-JIS decoding fails using wifstream in Visual C++ 2013

Problem description

I am trying to read a text file encoded in Shift-JIS (code page 932) using std::wifstream and std::getline. The following code works in VS2010 but fails in VS2013:

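#include <fstream>
#include <locale>
#include <string>

std::wifstream in;
in.open("data932.txt");

// Imbue the stream with a locale that uses code page 932 (Shift-JIS).
const std::locale locale(".932");
in.imbue(locale);

std::wstring line1, line2;
std::getline(in, line1);      // first line: ASCII only
std::getline(in, line2);      // second line: Japanese text
const bool good = in.good();  // expected to be true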

The file contains several lines, where the first line contains just ASCII characters, and the second is Japanese script. Thus, when this snippet runs, line1 should contain the ASCII line, line2 the Japanese script, and good should be true.

When compiled in VS2010, the result is as expected. But when compiled in VS2013, line1 contains the ASCII line, but line2 is empty, and good is false.
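When a read fails like this, it can help to see which state bits are set rather than only checking good(). The following is a minimal sketch; describe_state is a hypothetical helper name, not part of the standard library or of the original code.

#include <ios>
#include <string>

// Hypothetical helper: renders a stream's state bits as text,
// e.g. describe_state(in.rdstate()) after each std::getline call.
inline std::string describe_state(std::ios_base::iostate state) {
    if (state == std::ios_base::goodbit) return "good";
    std::string out;
    if (state & std::ios_base::eofbit)  out += "eof ";
    if (state & std::ios_base::failbit) out += "fail ";
    if (state & std::ios_base::badbit)  out += "bad ";
    if (!out.empty()) out.pop_back();  // drop the trailing space
    return out;
}

Calling it after each std::getline shows whether the stream ran into end-of-file, a failed extraction, or both.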

I debugged into the CRT (its source ships with Visual Studio) and found that an internal function called _Mbrtowc (in file xmbtowc.c) was modified between the two versions. The way it detects the lead byte of a double-byte character changed, and the VS2013 version fails to detect a lead byte, so it cannot decode the byte stream.

Further debugging led to the point where a _Cvtvec object's _Isleadbyte array is initialized (in the function _Getcvt(), in file xwctomb.c), and that initialization produces a wrong result. It seems to always use code page 1252, which is the default code page on my system, rather than 932, which is set for the stream in use. However, I could not determine whether this is by design and I missed a required step, or whether it is indeed a bug in the VS2013 CRT.
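To illustrate why the lead-byte table matters: in code page 932 a byte in the range 0x81-0x9F or 0xE0-0xFC starts a two-byte character, whereas code page 1252 is a single-byte code page with no lead bytes at all. The sketch below is not the CRT's implementation; is_cp932_lead_byte and count_cp932_chars are hypothetical names used only for illustration.

#include <cstddef>
#include <string>

// True if `b` starts a two-byte character in code page 932 (Shift-JIS).
inline bool is_cp932_lead_byte(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

// Counts the characters in a CP932-encoded byte string by stepping
// two bytes after a lead byte and one byte otherwise.
inline std::size_t count_cp932_chars(const std::string& bytes) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++count) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        i += (is_cp932_lead_byte(b) && i + 1 < bytes.size()) ? 2 : 1;
    }
    return count;
}

A decoder whose table was built for code page 1252 never takes the two-byte step, which is consistent with the failure described above.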

Any insights on this topic are welcome!

Recommended answer

I have found a workaround: if I explicitly change the global multibyte (MBC) code page while the locale is being created, the locale is initialized correctly, and the lines are read and decoded as expected.

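#include <mbctype.h>  // _getmbcp, _setmbcp

// Switch the global multibyte code page to 932 only while the locale is constructed.
const int oldMbcp = _getmbcp();
_setmbcp(932);
const std::locale locale("Japanese_Japan.932");
_setmbcp(oldMbcp);  // restore the previous code page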
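One caveat with the workaround as written: the std::locale constructor throws std::runtime_error if the name is not accepted, in which case the global code page would stay at 932. A possible refinement, sketched under the assumption that restoring on every exit path is desired, is a small scope guard. ScopeExit and make_sjis_locale are hypothetical names, and the MSVC-specific part is only compiled when _MSC_VER is defined.

#include <functional>
#include <locale>
#include <utility>
#ifdef _MSC_VER
#include <mbctype.h>  // _getmbcp, _setmbcp (MSVC CRT only)
#endif

// Hypothetical helper: runs the stored callback when it goes out of scope,
// including when the scope is left because of an exception.
class ScopeExit {
public:
    explicit ScopeExit(std::function<void()> fn) : fn_(std::move(fn)) {}
    ~ScopeExit() { if (fn_) fn_(); }
    ScopeExit(const ScopeExit&) = delete;
    ScopeExit& operator=(const ScopeExit&) = delete;
private:
    std::function<void()> fn_;
};

#ifdef _MSC_VER
// Hypothetical wrapper around the workaround: builds the locale while the
// global multibyte code page is 932, then restores the previous value.
inline std::locale make_sjis_locale() {
    const int oldMbcp = _getmbcp();
    _setmbcp(932);
    ScopeExit restore([oldMbcp] { _setmbcp(oldMbcp); });
    return std::locale("Japanese_Japan.932");  // may throw std::runtime_error
}
#endif

The stream from the question could then be set up with in.imbue(make_sjis_locale()) before the first read.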
