How do I pass non-English characters from char* to wchar_t* on an English OS


Problem description

Code Snippet:

int Convertchar_wchar(char* pData, int pDataLength)
{
    wchar_t wcsQuery[4096*2 + 2];
    memcpy((void*)wcsQuery, pData, pDataLength);
    wcout << wcsQuery << endl;
    return 0;
}





I am adding multilingual support to some code, so I need to handle non-English text on an English OS (Win2K3 here). When I pass non-English characters (I tried Japanese), they come out as ?????. I have confirmed this is not a display problem: the value changes when memcpy is called. The data passed in through char* pData is already Unicode, yet it still ends up as the wrong values.

Can someone help me understand why memcpy is converting the value? Does memcpy use the default code page internally? How can I pass the correct value to the wide-char pointer?

I have already tried wcscpy and RtlCopyMemory. If I use MultiByteToWideChar, I am not sure which code page to pass so that all languages are supported.

Waiting for some input ASAP.

Thanks

Recommended answer

Your starting point seems incorrect: an array declared as char* pData holds bytes, not wide Unicode characters. It is (at best) multibyte data, so a straight memcpy() into a wchar_t array still leaves you with multibyte data, just sitting in a wide buffer. As Superman mentioned, you need to convert it to Unicode (assuming that is what you are trying to achieve).

However, I suspect the basic problem is your use of wcout to display the characters. That stream expects wide (Unicode) characters; you are not passing it any, so your data is rendered as garbage. Try the following:
int Convertchar_wchar(char* pData, int pDataLength)
{
    cout << pData << endl;
    return 0;
}

// or if you want to do the conversion

int Convertchar_wchar(char* pData, int pDataLength)
{
    // Value-initialized so the result stays null-terminated.
    // For UTF-8 input, one wchar_t per input byte is always enough room.
    wchar_t* wcsQuery = new wchar_t[pDataLength + 1]();
    MultiByteToWideChar(CP_UTF8, 0, pData, pDataLength, wcsQuery, pDataLength + 1);
    wcout << wcsQuery << endl;
    delete [] wcsQuery; // Don't forget to deallocate the buffer!
    return 0;
}
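
The claim that memcpy() leaves the bytes untouched can be checked in isolation. The following is a minimal sketch, not code from the thread: the helper name reinterpret_as_utf16 is hypothetical, and char16_t stands in for the 16-bit wchar_t used on Windows so that the example behaves the same on other platforms. memcpy() consults no code page; the "wrong values" come from reading UTF-8 bytes as if they were 16-bit code units.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): reinterpret raw bytes as
// 16-bit units, the same way the memcpy() in the question does.
std::vector<char16_t> reinterpret_as_utf16(const std::string& bytes)
{
    // Round up so a trailing odd byte still fits; unused space stays zero.
    std::vector<char16_t> units((bytes.size() + 1) / 2, u'\0');
    if (!bytes.empty()) {
        // The bytes are copied verbatim. No conversion of any kind happens here.
        std::memcpy(units.data(), bytes.data(), bytes.size());
    }
    return units;
}
```

Feeding it the three UTF-8 bytes of U+65E5 (E6 97 A5) yields two 16-bit units whose bytes are identical to the input, and neither unit is 0x65E5. That is why the result looks like garbage when treated as wide text.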


If you're dealing with Unicode characters, you should be using wchar_t buffers from the very beginning. You shouldn't need a conversion at all.
If it cannot be avoided, you should do this:
MultiByteToWideChar(CP_UTF8, 0, charArray, -1, wcharArray, wcharArrayLen);
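
To make concrete what kind of work a UTF-8 to UTF-16 conversion involves, here is a portable sketch of a decoder. It is an illustration only, not the Windows implementation and not a replacement for the MultiByteToWideChar call above: the function name utf8_to_utf16 is hypothetical, it assumes the input really is UTF-8, and it skips some validation (overlong forms, encoded surrogates).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Illustrative only: a minimal UTF-8 to UTF-16 decoder. It rejects bad lead
// bytes, truncated sequences, and bad continuation bytes, but does not check
// for overlong encodings or encoded surrogates.
std::u16string utf8_to_utf16(const std::string& in)
{
    std::u16string out;
    for (std::size_t i = 0; i < in.size();) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (in.size() - i - 1 < extra)
            throw std::invalid_argument("truncated UTF-8 sequence");

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);  // append the 6 payload bits
        }
        i += extra + 1;

        if (cp > 0x10FFFF)
            throw std::invalid_argument("code point out of range");
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

For example, the six bytes E6 97 A5 E6 9C AC decode to the two code units U+65E5 U+672C. On Windows, prefer the system call, which also handles the other code pages.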


The thing is that this is legacy code which I need to enhance. The data actually arrives as Unicode over the wire from a Java application.

Until now, Unicode has worked fine for Japanese data on a Japanese OS and Chinese data on a Chinese OS. The problem I am facing is supporting these languages on an English OS. As far as I understand, memcpy should just copy the data, regardless of what kind of data it is. In practice, though, my understanding seems to be wrong.

So I need to know: does memcpy use the default code page internally?

Also, the data coming over the wire will not always be UTF-8. If I need to determine the type of data beforehand, can you help me with how to do that?
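
One partial way to determine the encoding beforehand is to look for a byte order mark (BOM) at the start of the buffer. The sketch below assumes the sender emits one, which the thread does not confirm; the names Encoding and detect_bom are hypothetical. Without a BOM the encoding cannot be identified reliably from the bytes alone, so the more robust fix is for the sender to state or fix the encoding in the protocol.

```cpp
#include <cstddef>

// Hypothetical result type for this sketch.
enum class Encoding { Utf8, Utf16LE, Utf16BE, Unknown };

// Checks only for the UTF-8 and UTF-16 byte order marks. UTF-32 is not
// handled, and Unknown means "no BOM found", not "not Unicode".
Encoding detect_bom(const unsigned char* data, std::size_t len)
{
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return Encoding::Utf8;      // EF BB BF
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return Encoding::Utf16LE;   // FF FE
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return Encoding::Utf16BE;   // FE FF
    return Encoding::Unknown;
}
```

If the result is Utf16LE, the payload after the BOM already has the byte layout of wchar_t on Windows; Utf16BE needs a byte swap per code unit first, and Utf8 goes through MultiByteToWideChar with CP_UTF8.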

