C++ tolower on special characters such as ü


Question

I have trouble transforming a string to lowercase with the tolower() function in C++. With normal strings, it works as expected, however special characters are not converted successfully.

How I use the function:

string NotLowerCase = "Grüßen";
string LowerCase = "";
for (unsigned int i = 0; i < NotLowerCase.length(); i++) {
    LowerCase += tolower(NotLowerCase[i]);
}

For example:


  1. Test -> test
  2. TeST2 -> test2
  3. Grüßen -> gr????en
  4. (§) -> ()

As you can see, 3 and 4 do not work as expected.

How can I fix this issue? I have to keep the special chars, but as lowercase.

Recommended answer

The sample code below, taken from the documentation for tolower, shows how to fix this: you have to use something other than the default "C" locale.

#include <iostream>
#include <cctype>
#include <clocale>

int main()
{
    unsigned char c = '\xb4'; // the character Ž in ISO-8859-15
                              // but ´ (acute accent) in ISO-8859-1 

    std::setlocale(LC_ALL, "en_US.iso88591");
    std::cout << std::hex << std::showbase;
    std::cout << "in iso8859-1, tolower('0xb4') gives "
              << std::tolower(c) << '\n';
    std::setlocale(LC_ALL, "en_US.iso885915");
    std::cout << "in iso8859-15, tolower('0xb4') gives "
              << std::tolower(c) << '\n';
}
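
Applied to the loop from the question, the same idea might look like the sketch below. It assumes the string is stored in a single-byte encoding such as ISO-8859-1 and that a matching locale is installed; the helper names to_lower_single_byte and try_set_locale are made up for illustration, and locale names such as "en_US.iso88591" vary by platform.

```cpp
#include <cctype>
#include <clocale>
#include <string>

// Hypothetical helper: lower-cases a string one byte at a time using the
// current C locale. Only meaningful for single-byte encodings.
std::string to_lower_single_byte(const std::string& input)
{
    std::string result;
    result.reserve(input.size());
    for (unsigned char ch : input) {
        // Passing a negative char value to std::tolower is undefined
        // behaviour, so each byte is read as unsigned char first.
        result += static_cast<char>(std::tolower(ch));
    }
    return result;
}

// Hypothetical helper: tries to switch to the named locale and reports
// whether it succeeded (std::setlocale returns nullptr on failure).
bool try_set_locale(const char* name)
{
    return std::setlocale(LC_ALL, name) != nullptr;
}
```

Calling try_set_locale with a suitable name before to_lower_single_byte lets bytes above 0x7F be mapped according to that locale. If the call fails, the program stays in its previous locale and, under the default "C" locale, only A to Z are converted.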

You might also change std::string to std::wstring which is Unicode on many C++ implementations.

wstring NotLowerCase = L"Grüßen";
wstring LowerCase;
for (auto&& ch : NotLowerCase) {
    // towlower returns wint_t, so convert the result back to wchar_t
    LowerCase += static_cast<wchar_t>(towlower(ch));
}

Guidance from Microsoft is to "Normalize strings to uppercase", so you might use toupper or towupper instead.

Keep in mind that a character-by-character transformation might not work well for some languages. For example, in German as written in Germany, making Grüßen all upper-case turns it into GRÜSSEN, which is one letter longer than the original (although there is now a capital ẞ). There are numerous other "problems", such as combining characters; if you're doing real "production" work with strings, you really want a completely different approach.
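
A related pitfall shows up one level lower. The sketch below assumes the string is stored as UTF-8, which the question does not state. In that encoding ü and ß each occupy two bytes, so a loop over std::string elements visits bytes rather than characters. The name count_utf8_code_points and the constant kGruessen are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// Counts code points in a UTF-8 string by skipping continuation bytes
// (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t count_utf8_code_points(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// "Grüßen" spelled out as UTF-8 bytes so the example does not depend on
// the source file's encoding: ü is C3 BC, ß is C3 9F. The literal is
// split before "en" so that 'e' is not parsed as part of the hex escape.
const std::string kGruessen = "Gr\xC3\xBC\xC3\x9F" "en";
```

Here kGruessen.size() is 8 while count_utf8_code_points(kGruessen) is 6. Under this assumption, the four non-ASCII bytes would be consistent with the four question marks in the output shown in the question.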

Finally, C++ has more sophisticated support for managing locales, see <locale> for details.
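
As a minimal sketch of what that could look like, the code below lower-cases a wide string through the std::ctype facet of an explicit std::locale object instead of the global C locale. The helper names to_lower_with and user_locale_or_classic are made up for illustration, and which characters are actually converted depends on the locale data available on the system.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: lower-cases every wide character using the ctype
// facet of the given locale.
std::wstring to_lower_with(const std::wstring& input, const std::locale& loc)
{
    const auto& facet = std::use_facet<std::ctype<wchar_t>>(loc);
    std::wstring result = input;
    if (!result.empty()) {
        // The range overload converts the characters in place.
        facet.tolower(&result[0], &result[0] + result.size());
    }
    return result;
}

// Hypothetical helper: returns the user's preferred locale, or the classic
// "C" locale if the environment names a locale that is not installed.
std::locale user_locale_or_classic()
{
    try {
        return std::locale("");
    } catch (const std::runtime_error&) {
        return std::locale::classic();
    }
}
```

A call such as to_lower_with(L"TeST2", user_locale_or_classic()) yields L"test2" on any conforming implementation. Whether Ü becomes ü depends on the locale in use, and the one-to-one mapping still cannot handle cases like ß, where the upper-case form has a different length.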
