What is the correct way of processing different string encodings via C++ main char** args?
Problem description
I need some clarification.
I have a Windows program written in C++ that uses the Windows-specific wmain function, which accepts wchar_t** as its args. This makes it possible to pass anything you like as command-line parameters to the program: Chinese symbols, Japanese ones, and so on.
To be honest, I have no information about the encoding this function is usually used with. Probably UTF-32, or even UTF-16. So, the questions:
- What is the non-Windows-specific, Unix/Linux way to achieve this with the standard main function? My first thought was to use UTF-8 encoded input strings together with some kind of locale setting.
- Can somebody give a simple example of such a main function? How can a std::string hold Chinese symbols?
- Can we work as usual with Chinese symbols that are UTF-8 encoded and stored in a std::string, accessing each char (byte) individually?
Answer
Disclaimer: all Chinese words were provided by the Google Translate service.
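1) Just proceed as normal using an ordinary std::string. A std::string can hold text in any character encoding, because it is simply a sequence of bytes, and argument processing is simple pattern matching. So on a Chinese computer with the Chinese version of the program installed, all the program needs to do is compare the Chinese versions of the flags with what the user types. This works as long as the string literals in the program and the terminal use the same encoding (typically UTF-8 on Linux).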
2) For example:
#include <string>
#include <vector>
#include <iostream>

std::string arg_switch = "开关";
std::string arg_option = "选项";
std::string arg_option_error = "缺少参数选项";

int main(int argc, char* argv[])
{
    const std::vector<std::string> args(argv + 1, argv + argc);

    bool do_switch = false;
    std::string option;

    for(auto arg = args.begin(); arg != args.end(); ++arg)
    {
        if(*arg == "--" + arg_switch)
            do_switch = true;
        else if(*arg == "--" + arg_option)
        {
            if(++arg == args.end())
            {
                // option needs a value - not found
                std::cout << arg_option_error << '\n';
                return 1;
            }
            option = *arg;
        }
    }

    std::cout << arg_switch << ": " << (do_switch ? "on":"off") << '\n';
    std::cout << arg_option << ": " << option << '\n';

    return 0;
}
Usage:
./program --开关 --选项 wibble
Output:
开关: on
选项: wibble
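3) No. In UTF-8 a single character can occupy several bytes (and in UTF-16 one or two 16-bit units), so accessing a std::string one char at a time gives you bytes, not characters.
For UTF-8/UTF-16 data you need a dedicated library such as ICU.
For character-by-character processing you need to use, or convert to, UTF-32.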