What is the correct way of processing different string encodings via C++ main char** args?

Problem description

I need some clarifications.

The problem is that I have a Windows program written in C++ that uses the Windows-specific 'wmain' function, which accepts wchar_t** as its args. So it is possible to pass anything you like as command-line parameters to such a program: for example, Chinese characters, Japanese ones, etc.

To be honest, I have no information about the encoding this function is usually used with. Probably UTF-32, or even UTF-16. So, the questions:

  • What is the non-Windows-specific, Unix/Linux way to achieve this with the standard main function? My first thought was to use UTF-8 encoded input strings together with some kind of locale setting?

  • Can somebody give a simple example of such a main function? How can a std::string hold Chinese characters?

  • Can we work with Chinese characters encoded in UTF-8 and stored in std::strings as usual, when we just access each char (byte) like this: string_object[i]?

Solution

Disclaimer: All Chinese words were provided by Google Translate.

1) Just proceed as normal using a normal std::string. A std::string is a sequence of bytes, so it can hold text in any encoding, and argument processing is simple pattern matching. So on a Chinese computer with the Chinese version of the program installed, all it needs to do is compare the Chinese versions of the flags to what the user inputs. This works as long as the flags stored in the program and the arguments passed by the shell use the same encoding (typically UTF-8 on Linux).

2) For example:

#include <string>
#include <vector>
#include <iostream>

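// Localized flag names and messages. The byte-wise comparisons below assume
// this source file and the terminal both use the same encoding (e.g. UTF-8).
const std::string arg_switch = "开关";             // "switch"
const std::string arg_option = "选项";             // "option"
const std::string arg_option_error = "缺少参数选项"; // "missing option argument"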

int main(int argc, char* argv[])
{
    const std::vector<std::string> args(argv + 1, argv + argc);

    bool do_switch = false;
    std::string option;

    for(auto arg = args.begin(); arg != args.end(); ++arg)
    {
        if(*arg == "--" + arg_switch)
            do_switch = true;
        else if(*arg == "--" + arg_option)
        {
            if(++arg == args.end())
            {
                // option needs a value - not found
                std::cout << arg_option_error << '\n';
                return 1;
            }
            option = *arg;
        }
    }

    std::cout << arg_switch << ": " << (do_switch ? "on":"off") << '\n';
    std::cout << arg_option << ": " << option << '\n';

    return 0;
}

Usage:

./program --开关 --选项 wibble

Output:

开关: on
选项: wibble
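
To see which bytes the shell actually hands to main, it can help to dump an argument as hex. The sketch below is only an illustration: `to_hex` is a hypothetical helper that is not part of the original answer, and it assumes nothing about the encoding, it simply prints bytes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original answer): render every byte of
// `s` as two lowercase hex digits, separated by single spaces.
std::string to_hex(const std::string& s)
{
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char b = static_cast<unsigned char>(s[i]);
        if (i != 0)
            out += ' ';
        out += digits[b >> 4];
        out += digits[b & 0x0F];
    }
    return out;
}
```

Calling `to_hex(argv[1])` inside main and printing the result shows the raw encoding of the first argument. In a UTF-8 terminal the character 开 would be expected to appear as the three bytes `e5 bc 80`, which is why comparing against a UTF-8 string literal works in the example above.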

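3) No.

string_object[i] returns a single byte, not a character, and in UTF-8 a Chinese character takes several bytes (usually three).

For UTF-8/UTF-16 data we need to use special libraries like ICU.

For character-by-character processing you need to use or convert to UTF-32.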
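
As a rough illustration of the conversion to UTF-32, here is a minimal hand-written decoder. It is a sketch under stated assumptions, not a replacement for ICU: `utf8_to_utf32` is a hypothetical helper name, the input is assumed to be UTF-8, and malformed input is reported by throwing `std::invalid_argument`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original answer): decode a UTF-8 byte
// string into a std::u32string holding one code point per element.
std::u32string utf8_to_utf32(const std::string& in)
{
    // Smallest code point that legitimately needs 1, 2, 3 or 4 bytes;
    // used to reject overlong encodings.
    static const char32_t min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::u32string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;
        std::size_t len = 0;

        // The lead byte determines the sequence length and the first bits.
        if (lead < 0x80)                { cp = lead;        len = 1; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; len = 2; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; len = 3; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; len = 4; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + len > in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds six bits.
        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogates and values beyond U+10FFFF.
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point");

        out.push_back(cp);
        i += len;
    }
    return out;
}
```

With this, the six-byte UTF-8 string for 开关 decodes to two elements, U+5F00 and U+5173, so indexing the result addresses whole code points rather than bytes. Note that one code point is still not always one user-visible character (combining marks, for instance), which is one reason a library such as ICU is the safer choice for real text processing.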
