How do I read Unicode-16 strings from a file using POSIX methods in Linux?

Views: 235 Published: 2020/5/29 18:45:05 Tags: windows linux unicode posix wchar-t

Problem description

I have a file containing UNICODE-16 strings that I would like to read into a Linux program. The strings were written raw from Windows' internal WCHAR format. (Does Windows always use UTF-16? e.g. in Japanese versions)

I believe that I can read them using raw reads and then converting with wcstombs_l. However, I cannot figure out what locale to use. Running "locale -a" on my up-to-date Ubuntu and Mac OS X machines yields zero locales with utf-16 in their names.

Is there a better way?
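The raw-read step mentioned above could be sketched roughly as follows. This is an illustration under stated assumptions, not the asker's code: `read_utf16le_file` is a hypothetical helper name, the file is assumed to hold little-endian UTF-16 (a byte-order mark, if present, is kept as an ordinary code unit), and the data is stored as `uint16_t` because `wchar_t` is typically 4 bytes on Linux.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: read a whole file of raw UTF-16LE data into a
 * NUL-terminated array of 16-bit code units. Returns NULL on failure;
 * the caller frees the result. *units receives the number of code
 * units read, not counting the added terminator. */
static uint16_t *read_utf16le_file(const char *path, size_t *units)
{
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size < 0)
  {
    close(fd);
    return NULL;
  }

  size_t nbytes = (size_t)st.st_size;
  unsigned char *raw = malloc(nbytes ? nbytes : 1);
  if (raw == NULL)
  {
    close(fd);
    return NULL;
  }

  /* read() may return fewer bytes than requested, so loop. */
  size_t got = 0;
  while (got < nbytes)
  {
    ssize_t n = read(fd, raw + got, nbytes - got);
    if (n < 0 && errno == EINTR)
      continue;                 /* interrupted: retry */
    if (n < 0)
    {
      free(raw);
      close(fd);
      return NULL;
    }
    if (n == 0)
      break;                    /* file shorter than fstat() reported */
    got += (size_t)n;
  }
  close(fd);

  /* Assemble code units byte by byte so the result does not depend
   * on the host's endianness. A trailing odd byte is ignored. */
  size_t count = got / 2;
  uint16_t *out = malloc((count + 1) * sizeof *out);
  if (out != NULL)
  {
    for (size_t i = 0; i < count; i++)
      out[i] = (uint16_t)(raw[2 * i] | (raw[2 * i + 1] << 8));
    out[count] = 0;
    *units = count;
  }
  free(raw);
  return out;
}
```

The returned array still contains UTF-16 code units; turning them into UTF-8 (or any other encoding) is a separate conversion step.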

Update: the correct answer and others below helped point me to using libiconv. Here's a function I'm using to do the conversion. I currently have it inside a class that makes the conversions into a one-line piece of code.

// Function for converting UTF-16LE code units to a UTF-8 char*.
// On Windows the source is a wchar_t*; on Linux wchar_t is normally
// 32 bits wide, so the raw 16-bit units are taken as uint16_t here.
// It will allocate the space needed for dest. The caller is
// responsible for freeing the memory.
#include <errno.h>
#include <iconv.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static int iwcstombs_alloc(char **dest, const uint16_t *src)
{
  iconv_t cd;
  const char from[] = "UTF-16LE";
  const char to[] = "UTF-8";

  cd = iconv_open(to, from);
  if (cd == (iconv_t)-1)
  {
    printf("iconv_open(\"%s\", \"%s\") failed: %s\n",
           to, from, strerror(errno));
    return(-1);
  }

  // Count the 16-bit code units before the terminating NUL.
  // wcslen() is not usable here because it steps in wchar_t units.
  size_t units = 0;
  while (src[units] != 0)
    units++;

  // How much space do we need?
  // One UTF-16 code unit becomes at most 3 bytes of UTF-8, so
  // 3 bytes per unit (terminator included) is always enough.
  size_t len = sizeof(uint16_t) * (units + 1);

  // Allocate space
  size_t destLen = 3 * (units + 1);
  *dest = (char *)malloc(destLen);
  if (*dest == NULL)
  {
    iconv_close(cd);
    return -1;
  }

  // Convert

  size_t inBufBytesLeft = len;
  char *inBuf = (char *)src;
  size_t outBufBytesLeft = destLen;
  char *outBuf = (char *)*dest;

  // iconv() returns size_t, so failure is reported as (size_t)-1.
  size_t rc = iconv(cd,
                    &inBuf,
                    &inBufBytesLeft,
                    &outBuf,
                    &outBufBytesLeft);
  if (rc == (size_t)-1)
  {
    printf("iconv() failed: %s\n", strerror(errno));
    iconv_close(cd);
    free(*dest);
    *dest = NULL;
    return -1;
  }

  iconv_close(cd);

  return 0;
} // iwcstombs_alloc()
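The function above sizes its output buffer once, before converting. A variant that instead lets iconv() report when it runs out of room, and grows the buffer in response, might look like the sketch below. It is only an illustration: `utf16le_to_utf8` is a hypothetical name, the input is assumed to be a byte buffer of UTF-16LE with an explicit length (no terminator required), and the code relies on iconv() setting errno to E2BIG when the output buffer is full, as POSIX specifies.

```c
#include <errno.h>
#include <iconv.h>
#include <stdlib.h>

/* Hypothetical helper: convert inBytes bytes of UTF-16LE into a
 * NUL-terminated, malloc'd UTF-8 string. Returns NULL on failure
 * (out of memory or malformed input); the caller frees the result. */
static char *utf16le_to_utf8(const char *in, size_t inBytes)
{
  iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
  if (cd == (iconv_t)-1)
    return NULL;

  size_t cap = inBytes + 1;        /* first guess; grown on E2BIG */
  char *out = malloc(cap);
  size_t used = 0;
  char *inPtr = (char *)in;        /* iconv() only reads the input */
  size_t inLeft = inBytes;

  while (out != NULL && inLeft > 0)
  {
    char *outPtr = out + used;
    size_t outLeft = cap - used - 1;   /* keep one byte for the NUL */
    size_t rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
    used = (size_t)(outPtr - out);

    if (rc != (size_t)-1)
      break;                       /* all input was converted */

    if (errno != E2BIG)            /* EILSEQ or EINVAL: bad input */
    {
      free(out);
      out = NULL;
      break;
    }

    /* E2BIG: output buffer full. Double it and continue from where
     * iconv() stopped; inPtr and inLeft were already advanced. */
    cap *= 2;
    char *bigger = realloc(out, cap);
    if (bigger == NULL)
      free(out);
    out = bigger;
  }

  iconv_close(cd);
  if (out != NULL)
    out[used] = '\0';
  return out;
}
```

Taking a byte count rather than a NUL-terminated string means the helper can be fed directly from a buffer filled by read(), at the cost of the caller having to know where each string ends.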
