接受非 ASCII 字符 [英] Accept non ASCII characters

查看:40
本文介绍了接受非 ASCII 字符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

考虑这个程序:

#include <stdio.h>

int main(int argc, char* argv[]) {
   printf("%s\n", argv[1]);  
   return 0;
}

我是这样编译的:

x86_64-w64-mingw32-gcc -o alpha alpha.c

问题是如果我给它一个非 ASCII 参数:

The problem is if I give it a non ASCII argument:

$ ./alpha róisín
r�is�n

我如何编写和/或编译此程序以使其接受非 ASCII人物?回应alk:不,程序打印错误.看这个例子:

How can I write and/or compile this program such that it accepts non ASCII characters? To respond to alk: no, the program is printing wrongly. See this example:

$ echo Ω | od -t x1c
0000000  ce  a9  0a
        316 251  \n
0000003

$ ./alpha Ω | od -t x1c
0000000  4f  0d  0a
          O  \r  \n
0000003

推荐答案

最简单的方法是使用 wmain:

The easiest way to do this is with wmain:

#include <fcntl.h>
#include <stdio.h>

int wmain (int argc, wchar_t** argv) {
  _setmode(_fileno(stdout), _O_WTEXT);
  wprintf(L"%s\n", argv[1]);
  return 0;
}

也可以用GetCommandLineW来完成;这是代码的简单版本在 HandBrake 存储库 中找到:

It can also be done with GetCommandLineW; here is a simple version of the code found at the HandBrake repo:

#include <stdio.h>
#include <windows.h>

int get_argv_utf8(int* argc_ptr, char*** argv_ptr) {
  int argc;
  char** argv;
  wchar_t** argv_utf16 = CommandLineToArgvW(GetCommandLineW(), &argc);
  int i;
  int offset = (argc + 1) * sizeof(char*);
  int size = offset;
  for (i = 0; i < argc; i++)
    size += WideCharToMultiByte(CP_UTF8, 0, argv_utf16[i], -1, 0, 0, 0, 0);
  argv = malloc(size);
  for (i = 0; i < argc; i++) {
    argv[i] = (char*) argv + offset;
    offset += WideCharToMultiByte(CP_UTF8, 0, argv_utf16[i], -1,
      argv[i], size-offset, 0, 0);
  }
  *argc_ptr = argc;
  *argv_ptr = argv;
  return 0;
}

int main(int argc, char** argv) {
  get_argv_utf8(&argc, &argv);
  printf("%s\n", argv[1]);
  return 0;
}

这篇关于接受非 ASCII 字符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆