Lowercase of Unicode characters

Question

I am working on a C++ project that needs to read data from Unicode text. My problem is that I cannot lowercase some Unicode characters. I use wchar_t to store the characters read from a Unicode file, and then I call _wcslwr to lowercase the wchar_t string. Many characters are still not lowercased, for example:

Đ Â Ă Ê Ô Ơ Ư Ấ Ắ Ế Ố Ớ Ứ Ầ Ằ Ề Ồ Ờ Ừ Ậ Ặ Ệ Ộ Ợ Ự

Their lowercase forms are:

đ â ă ê ô ơ ư ấ ắ ế ố ớ ứ ầ ằ ề ồ ờ ừ ậ ặ ệ ộ ợ ự 

I have also tried tolower, and it still does not work.

Recommended answer

If you call plain tolower with a single argument, you get std::tolower from the header <cctype> (not <clocale>), which is the C function for single-byte (ANSI) characters only. It does not handle wide characters such as Đ or Ư.

The overload to use is the locale-aware one declared in <locale>:

template< class charT >
charT tolower( charT ch, const locale& loc );

Below are two versions that work well:

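#include <iostream>  // std::wcout
#include <string>    // std::wstring
#include <cwctype>   // std::towlower
#include <clocale>   // std::setlocale
#include <locale>    // std::locale, std::tolower(charT, const std::locale&)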

int main() {
    std::setlocale(LC_ALL, "");
    std::wstring data = L"Đ Â Ă Ê Ô Ơ Ư Ấ Ắ Ế Ố Ớ Ứ Ầ Ằ Ề Ồ Ờ Ừ Ậ Ặ Ệ Ộ Ợ Ự";
    std::wcout << data << std::endl;

    // C std::towlower
    for(auto c: data)
    {
        std::wcout << static_cast<wchar_t>(std::towlower(c));
    }
    std::wcout << std::endl;

    // C++ std::tolower(charT, std::locale)
    std::locale loc("");
    for(auto c: data)
    {
        // This is recommended
        std::wcout << std::tolower(c, loc);
    }
    std::wcout << std::endl;
    return 0;
}
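
To lowercase a whole string in one call rather than character by character, the same locale machinery can be used through the std::ctype<wchar_t> facet. The following is only a minimal sketch: the helper name to_lower_copy is hypothetical and not part of the original answer, and which non-ASCII characters are actually converted depends on the locale data available on the system.

```cpp
#include <locale>
#include <string>

// Hypothetical helper (not from the original answer): returns a lowercased
// copy of `text`, using the case-mapping rules of the locale `loc`.
std::wstring to_lower_copy(std::wstring text, const std::locale& loc) {
    if (text.empty()) {
        return text;
    }
    // Every std::locale carries a std::ctype<wchar_t> facet.
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    // The range overload tolower(low, high) converts [low, high) in place.
    facet.tolower(&text[0], &text[0] + text.size());
    return text;
}
```

It could be called as to_lower_copy(data, std::locale("")) to use the user's environment locale, as the program above does. Note that constructing std::locale("") can throw std::runtime_error if the environment names a locale that is not installed.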

Reference:

tolower
