Using fread() to read a text file into a buffer - why are the values in the buffer not each character's respective ASCII value?

Question

First off, this isn't homework. Just trying to understand why I'm seeing what I'm seeing on my screen.

The stuff below (my own work) currently takes an input file and reads it as a binary file. I want it to store each byte read in an array (for later use). For the sake of brevity the input file (Hello.txt) just contains 'Hello World', without the apostrophes.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {

    FILE *input;
    int i, size;
    int *array;

    input = fopen("Hello.txt", "rb");
    if (input == NULL) {
        perror("Invalid file specified.");
        exit(-1);
    }

    fseek(input, 0, SEEK_END);
    size = ftell(input);
    fseek(input, 0, SEEK_SET);

    array = (int*) malloc(size * sizeof(int));
    if (array == NULL) {
        perror("Could not allocate array.");
        exit(-1);
    }
    else {
        input = fopen("Hello.txt", "rb");
        fread(array, sizeof(int), size, input);
        // some check on return value of fread?
        fclose(input);
    }

    for (i = 0; i < size; i++) {
        printf("array[%d] == %d\n", i, array[i]);
    }

    free(array);
    return 0;
}

Why is it that having the print statement in the for loop as it is above causes the output to look like this

array[0] == 1819043144
array[1] == 1867980911
array[2] == 6581362
array[3] == 0
array[4] == 0
array[5] == 0
array[6] == 0
array[7] == 0
array[8] == 0
array[9] == 0
array[10] == 0

while having it like this

printf("array[%d] == %d\n", i, ((char *)array)[i]);

makes the output look like this (decimal ASCII value for each character)

array[0] == 72
array[1] == 101
array[2] == 108
array[3] == 108
array[4] == 111
array[5] == 32
array[6] == 87
array[7] == 111
array[8] == 114
array[9] == 108
array[10] == 100

? If I'm reading it as a binary file and want to read byte by byte, why don't I get the right ASCII value using the first print statement?

On a related note, what happens if the input file I send in isn't a text document (e.g., jpeg)?

Sorry if this is an entirely trivial matter, but I can't seem to figure out why.

Recommended answer

The behavior is not surprising:


  • You have a file containing 11 characters. sizeof(char) is 1.
  • Now you allocate an array of 11 ints. sizeof(int) is very likely to be 4 on your machine.
  • You instruct fread to read up to 11 ints (up to 44 bytes). So the first 4 characters will be read as one int and stored in array[0], and the next 4 in array[1].
    • If you had checked the return value of fread, it would have told you that it actually read only 2 elements: the content is 11 bytes, so only 2 complete ints fit, and the last 3 bytes do not make up a whole int.

      The memory layout basically looks like this:

      array[0]
      |       array[1]
      |       |
      1 2 3 4 5 6 7 8 9 10 11
      | |
      | ((char *)array)[1]
      ((char *)array)[0]
      
