Reading a UTF-16 CSV file by char


Question

Currently I am trying to read a UTF-16 encoded CSV file char by char and convert each char to ASCII so I can process it. I later plan to convert the processed data back to UTF-16, but that is beside the point right now.

I know right off the bat that I am doing this completely wrong, as I have never attempted anything like this before:

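#include <stdio.h>
#include <wchar.h>

int main(void)
{
    FILE *fp;
    int ch;
    if (!(fp = fopen("x.csv", "r"))) return 1;
    /* Read first, then test, so EOF itself is never printed. */
    while ((ch = fgetc(fp)) != EOF)
    {
        /* These casts do not decode UTF-16; ch still holds a single byte. */
        ch = (wchar_t) ch;
        ch = (char) ch;
        printf("%c", ch);
    }
    fclose(fp);
    return 0;
}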

Wishful thinking: I was hoping this would work by magic for some reason, but that was not the case. How can I read a UTF-16 CSV file and convert it to ASCII? My guess is that since each UTF-16 char is two bytes (I think?), I will have to read two bytes at a time from the file into a variable of some data type, though I am not sure which. Then I guess I will have to check the bits of this variable to make sure it is valid ASCII and convert it from there? I don't know how I would do this, though, and any help would be great.

Recommended answer

You should use fgetwc. The code below should work provided the file starts with a byte-order mark and a locale named en_US.UTF-16 is available on the system.

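#include <stdio.h>
#include <wchar.h>
#include <locale.h>

int main(void) {
  /* Assumes this locale exists; setlocale returns NULL when it does not. */
  setlocale(LC_ALL, "en_US.UTF-16");

  FILE *fp = fopen("x.csv", "rb");
  if (fp) {
    /* Consume both BOM bytes; FE FF marks big-endian data. */
    int b0 = fgetc(fp);
    int b1 = fgetc(fp);
    int order = (b0 == 0xFE && b1 == 0xFF);

    /* Note: ISO C does not permit mixing byte reads (fgetc) and wide reads
       (fgetwc) on one stream, so this relies on implementation behavior. */
    wint_t ch;
    while ((ch = fgetwc(fp)) != WEOF) {
      /* After an FE FF mark, print the high byte of ch; otherwise ch itself. */
      putchar(order ? ch >> 8 : ch);
    }
    putchar('\n');

    fclose(fp);
    return 0;
  } else {
    perror("opening x.csv");
    return 1;
  }
}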
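
The idea from the question, reading two bytes at a time and checking whether the value fits in ASCII, can also be done without any locale support. The following is a minimal sketch rather than a definitive implementation. The names read_unit and utf16_to_ascii are hypothetical, and it assumes that a file without a byte-order mark is little-endian and that any non-ASCII code unit should be replaced with '?'.

```c
#include <assert.h>
#include <stdio.h>

/* Reads one 16-bit code unit from `in`; returns -1 at end of input.
   A dangling odd byte at the end of the input is dropped. */
static long read_unit(FILE *in, int big_endian)
{
    int b0 = fgetc(in);
    if (b0 == EOF) return -1;
    int b1 = fgetc(in);
    if (b1 == EOF) return -1;
    return big_endian ? ((long)b0 << 8) | b1 : ((long)b1 << 8) | b0;
}

/* Copies UTF-16 text from `in` to `out` as ASCII.
   Code units of 0x80 and above are written as '?', so a surrogate
   pair produces two '?' characters.
   Returns the number of code units that were replaced. */
long utf16_to_ascii(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    int big_endian = 0;            /* assumption: little-endian without a BOM */
    long replaced = 0;

    /* Read the first unit as little-endian and see whether it is a BOM. */
    long unit = read_unit(in, 0);
    if (unit == 0xFFFE) {          /* bytes FE FF: big-endian BOM */
        big_endian = 1;
        unit = read_unit(in, big_endian);
    } else if (unit == 0xFEFF) {   /* bytes FF FE: little-endian BOM */
        unit = read_unit(in, big_endian);
    }

    while (unit != -1) {
        if (unit < 0x80) {
            fputc((int)unit, out); /* plain ASCII, copy through */
        } else {
            fputc('?', out);       /* not representable in ASCII */
            replaced++;
        }
        unit = read_unit(in, big_endian);
    }
    return replaced;
}
```

To try it on the file from the question, open x.csv with fopen("x.csv", "rb") and call utf16_to_ascii(fp, stdout). Binary mode matters on platforms that translate line endings in text mode.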
