Reading a stream of values from a text file in C

Question

I have a text file that may contain anywhere from one to 400 numbers. The numbers are separated by commas, and a semicolon marks the end of the number stream. At the moment I am reading the text file line by line using fgets, so I use a fixed array of 1024 elements (the maximum number of characters per line in the file). This is not ideal: if the file contains only one number, a 1024-element array is wasteful. Is there a way to use fgets with malloc (or any other method) to improve memory efficiency?

Recommended answer

Dynamically allocating space for your data is a fundamental tool for working in C, so you might as well pay the price to learn it. The primary thing to remember is:

"If you allocate memory, you have the responsibility to track its use and preserve a pointer to the starting address of the block so you can free it when you are done with it. Otherwise your code will leak memory like a sieve."

Dynamic allocation is straightforward. You allocate some initial block of memory and keep track of what you add to it. You must test that each allocation succeeds. You must also test how much of the block you have used, and either reallocate or stop writing when it is full, so that you never write beyond the end of the block. If you skip either test, you risk dereferencing a NULL pointer or writing past the end of the block, both of which are undefined behavior.

When you reallocate, always assign the result of realloc to a temporary pointer. If realloc fails, it returns NULL and leaves the original block allocated and unchanged. Assigning that NULL directly to your only pointer loses the address of the original block, so its data becomes unreachable and the memory is leaked. Using a temporary pointer lets you handle the failure and still use or free the original block.
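
The temporary-pointer rule can be sketched as a small helper. This is a minimal illustration, not the code used in the program below: grow_longs is a hypothetical name, and the overflow check is an extra precaution that the program below does not perform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the answer's program): doubles the
 * capacity of a long array. Returns 0 on success, -1 on failure.
 * On failure *arr and *cap are left untouched, so the caller still owns
 * the original block and can keep using it or free it. */
int grow_longs (long **arr, size_t *cap)
{
    assert (arr != NULL && cap != NULL);

    if (*cap == 0 || *cap > SIZE_MAX / (2 * sizeof **arr))
        return -1;                      /* nothing to double, or overflow */

    long *tmp = realloc (*arr, 2 * *cap * sizeof **arr);
    if (!tmp)
        return -1;                      /* *arr is still valid here */

    *arr = tmp;                         /* commit only after success */
    *cap *= 2;
    return 0;
}
```

A caller would test the return value and, on -1, decide whether to stop reading and work with the values collected so far or to free the array and exit.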

Taking that into consideration, the code below initially allocates space for 64 long values (you can easily change the code to handle any type, e.g. int, float, double...). It then reads each line of data, using the POSIX function getline to allocate the line buffer dynamically. strtol parses the buffer and the resulting values are assigned to the array. idx is the index that tracks how many values have been read; when idx reaches the current nmax, the array is reallocated to twice its previous size and nmax is updated to reflect the change. The reading, parsing, checking and reallocating continues for every line of data in the file. When done, the values are printed to stdout, showing the 400 random values read from a test file formatted as 353,394,257,...293,58,135;
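
The question asked specifically about fgets with malloc. getline is POSIX rather than standard C, so where it is unavailable, roughly the same effect can be had with fgets and realloc. The following is a minimal sketch under that assumption, not part of the program below; read_line and its initial size of 32 bytes are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line of any length from fp using fgets,
 * growing the buffer with realloc as needed. Returns a malloc'd string
 * (caller frees) with the trailing newline removed, or NULL at end of
 * file or on allocation failure. */
char *read_line (FILE *fp)
{
    size_t cap = 32, len = 0;           /* small initial size (assumed) */
    char *buf;

    assert (fp != NULL);

    if (!(buf = malloc (cap)))
        return NULL;

    while (fgets (buf + len, (int)(cap - len), fp)) {
        len += strlen (buf + len);
        if (len && buf[len - 1] == '\n') {      /* whole line read */
            buf[--len] = '\0';
            return buf;
        }
        if (len + 1 == cap) {           /* buffer full: double it */
            char *tmp = realloc (buf, cap * 2);
            if (!tmp) {
                free (buf);             /* original block is still ours */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                     /* nothing read: end of file */
        free (buf);
        return NULL;
    }
    return buf;                         /* last line had no newline */
}
```

Each returned line uses at most about twice the memory it needs, so a file holding a single number costs 32 bytes rather than 1024.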

To keep the read loop logic clean, I've put the error checking for the strtol conversion into a function, xstrtol, but you are free to include that code in main() if you like. The same applies to the realloc_long function. To see when the reallocation takes place, you can compile the code with the -DDEBUG definition. For example:

gcc -Wall -Wextra -DDEBUG -o progname yoursourcefile.c

The program expects your data filename as the first argument and you can provide an optional conversion base as the second argument (default is 10). E.g.:

./progname datafile.txt [base (default: 10)]
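
The program below converts the optional base with atoi, which cannot report bad input. If you want to reject an invalid base up front, something like the following sketch could be used instead. parse_base is a hypothetical helper, not part of the original program; the accepted range reflects that strtol takes a base of 0 or 2 through 36.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts a command-line argument to a base that
 * strtol accepts (0, or 2 through 36). Returns the base, or -1 if the
 * argument is not a number or is outside that range. */
int parse_base (const char *arg)
{
    char *ep;

    assert (arg != NULL);

    errno = 0;
    long b = strtol (arg, &ep, 10);

    if (ep == arg || *ep != '\0' || errno == ERANGE)
        return -1;                      /* not a clean decimal number */
    if (b != 0 && (b < 2 || b > 36))
        return -1;                      /* not a base strtol supports */

    return (int) b;
}
```

In main() this would replace the atoi call, with a usage message printed when the result is -1.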

Look over it, test it, and let me know if you have any questions.

#define _POSIX_C_SOURCE 200809L     /* expose getline() and ssize_t */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>

#define NMAX 64

long xstrtol (char *p, char **ep, int base);
long *realloc_long (long *lp, unsigned long *n);

int main (int argc, char **argv)
{

    char *ln = NULL;                /* NULL forces getline to allocate  */
    size_t n = 0;                   /* buffer size, updated by getline  */
    ssize_t nchr = 0;               /* number of chars actually read    */
    size_t idx = 0;                 /* array index counter              */
    long *array = NULL;             /* pointer to long                  */
    unsigned long nmax = NMAX;      /* initial reallocation counter     */
    FILE *fp = NULL;                /* input file pointer               */
    int base = argc > 2 ? atoi (argv[2]) : 10; /* base (default: 10)    */

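    /* validate arguments, then open the file */
    if (argc < 2) {
        fprintf (stderr, "usage: %s datafile [base (default: 10)]\n", argv[0]);
        return 1;
    }
    if (!(fp = fopen (argv[1], "r"))) {
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }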

    /* allocate array of NMAX long using calloc to initialize to 0 */
    if (!(array = calloc (NMAX, sizeof *array))) {
        fprintf (stderr, "error: memory allocation failed.\n");
        fclose (fp);
        return 1;
    }

    /* read each line from file - separate into array       */
    while ((nchr = getline (&ln, &n, fp)) != -1)
    {
        char *p = ln;      /* pointer to ln read by getline */
        char *ep = NULL;   /* endpointer for strtol         */

        for (;;)
        {   /* skip delimiters/move pointer to next number;
               like the original, this assumes decimal digits */
            while (*p && *p != '-' && (*p < '0' || *p > '9')) p++;
            if (!*p)                /* end of line (or blank line) */
                break;

            /* parse/convert each number in line into array */
            array[idx++] = xstrtol (p, &ep, base);

            if (idx == nmax)        /* check NMAX / realloc */
                array = realloc_long (array, &nmax);

            p = ep;                 /* continue after the parsed number */
        }
    }

    free (ln);                      /* free memory allocated by getline */
    fclose (fp);                    /* close the input file stream      */

    /* idx is a size_t, so use a size_t counter and %zu */
    for (size_t i = 0; i < idx; i++)
        printf (" array[%zu] : %ld\n", i, array[i]);

    free (array);

    return 0;
}

/* reallocate long pointer memory */
long *realloc_long (long *lp, unsigned long *n)
{
    long *tmp = realloc (lp, 2 * *n * sizeof *lp);
#ifdef DEBUG
    printf ("\n  reallocating %lu to %lu\n", *n, *n * 2);
#endif
    if (!tmp) {
        fprintf (stderr, "%s() error: reallocation failed.\n", __func__);
        // return NULL;
        exit (EXIT_FAILURE);
    }
    lp = tmp;
    memset (lp + *n, 0, *n * sizeof *lp); /* memset new ptrs 0 */
    *n *= 2;

    return lp;
}

long xstrtol (char *p, char **ep, int base)
{
    errno = 0;

    long tmp = strtol (p, ep, base);

    /* Check for various possible errors */
    if ((errno == ERANGE && (tmp == LONG_MIN || tmp == LONG_MAX)) ||
        (errno != 0 && tmp == 0)) {
        perror ("strtol");
        exit (EXIT_FAILURE);
    }

    if (*ep == p) {
        fprintf (stderr, "No digits were found\n");
        exit (EXIT_FAILURE);
    }

    return tmp;
}
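
The question says a semicolon marks the end of the number stream, whereas the program above treats the semicolon as just another delimiter. If you want parsing to stop at the terminator, one possible approach is sketched below. parse_until_semicolon is a hypothetical helper operating on a string already in memory. It assumes base 10 and a caller-supplied array, so it is an illustration, not a drop-in replacement for the read loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses comma-separated base-10 integers from s into
 * out (at most max values) and stops at the first ';' or at the end of the
 * string. Returns the number of values stored, or -1 on malformed input. */
long parse_until_semicolon (const char *s, long *out, size_t max)
{
    size_t n = 0;

    assert (s != NULL && out != NULL);

    while (*s && *s != ';') {
        char *ep;

        if (n == max)
            return -1;                  /* more values than out can hold */

        errno = 0;
        long v = strtol (s, &ep, 10);
        if (ep == s || errno == ERANGE)
            return -1;                  /* no digits, or out of range */

        out[n++] = v;
        s = ep;
        while (*s == ' ' || *s == '\t' || *s == '\n')
            s++;                        /* skip whitespace after a value */
        if (*s == ',')
            s++;                        /* step over one separator */
        else if (*s && *s != ';')
            return -1;                  /* unexpected character */
    }
    return (long) n;
}
```

Anything after the semicolon is ignored, so the input "353,394,257;999" yields three values.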

Sample output (compiled with -DDEBUG to show the reallocations)

$ ./bin/read_long_csv dat/randlong.txt

  reallocating 64 to 128

  reallocating 128 to 256

  reallocating 256 to 512
 array[0] : 353
 array[1] : 394
 array[2] : 257
 array[3] : 173
 array[4] : 389
 array[5] : 332
 array[6] : 338
 array[7] : 293
 array[8] : 58
 array[9] : 135
<snip>
 array[395] : 146
 array[396] : 324
 array[397] : 424
 array[398] : 365
 array[399] : 205

Memory error check

$ valgrind ./bin/read_long_csv dat/randlong.txt
==26142== Memcheck, a memory error detector
==26142== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==26142== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==26142== Command: ./bin/read_long_csv dat/randlong.txt
==26142==

  reallocating 64 to 128

  reallocating 128 to 256

  reallocating 256 to 512
 array[0] : 353
 array[1] : 394
 array[2] : 257
 array[3] : 173
 array[4] : 389
 array[5] : 332
 array[6] : 338
 array[7] : 293
 array[8] : 58
 array[9] : 135
<snip>
 array[395] : 146
 array[396] : 324
 array[397] : 424
 array[398] : 365
 array[399] : 205
==26142==
==26142== HEAP SUMMARY:
==26142==     in use at exit: 0 bytes in 0 blocks
==26142==   total heap usage: 7 allocs, 7 frees, 9,886 bytes allocated
==26142==
==26142== All heap blocks were freed -- no leaks are possible
==26142==
==26142== For counts of detected and suppressed errors, rerun with: -v
==26142== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
