Reading an unknown number of lines with unknown length from stdin

Problem description

I'm relatively new to programming in C and am trying to read input from stdin using fgets.

To begin with, I thought about reading at most 50 lines of at most 50 characters each, and had something like:

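int max_length = 50;
char lines[max_length][max_length];
char current_line[max_length];
int idx = 0;

/* stop at max_length lines so idx never runs past the array */
while (idx < max_length && fgets(current_line, max_length, stdin) != NULL) {
    strcpy(lines[idx], current_line);   /* strcpy is declared in <string.h> */
    idx++;
}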

The snippet above successfully reads the input and stores it in the lines array, where I can sort and print it.

My question is: how do I deal with an unknown number of lines, each with an unknown number of characters? (Bear in mind that I have to sort the lines and print them out.)

Recommended answer

While a number of variations of this problem have already been answered, the considerations of how to go about it deserve a paragraph. The approach is the same regardless of which combination of library or POSIX functions you use.

Essentially, you dynamically allocate a reasonable number of characters to hold each line. POSIX getline does this for you automatically. With fgets, you read a fixed-size buffer full of chars and append them to the line (reallocating storage as necessary) until the '\n' character is read or EOF is reached.
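The fgets approach described above can be sketched roughly as follows. This is only an illustration using standard C: the function name read_line and the CHUNK size are assumptions made for the example, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 64    /* hypothetical size of each fgets read */

/* Read one line of any length from fp.
 * Returns a malloc'd string with the trailing '\n' removed, or NULL
 * when nothing could be read (EOF) or an allocation failed.
 * The caller is responsible for calling free() on the result. */
char *read_line (FILE *fp)
{
    char chunk[CHUNK];      /* fixed buffer filled by fgets */
    char *line = NULL;      /* growing storage for the full line */
    size_t len = 0;         /* characters stored in line so far */

    while (fgets (chunk, sizeof chunk, fp)) {
        size_t n = strlen (chunk);
        /* realloc to a temporary pointer so 'line' survives a failure */
        char *tmp = realloc (line, len + n + 1);
        if (!tmp) {
            free (line);
            return NULL;
        }
        line = tmp;
        memcpy (line + len, chunk, n + 1);  /* append, including nul */
        len += n;
        if (len && line[len - 1] == '\n') { /* complete line read */
            line[len - 1] = '\0';
            break;
        }
    }
    return line;    /* NULL if fgets returned nothing at all */
}
```

A caller would loop with something like while ((p = read_line (stdin))) and store each returned pointer, since every call hands back a separately allocated block.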

If you use getline, you must allocate memory for, and copy, the buffer it fills. getline reuses (and may reallocate) the buffer you pass it on every call, so if you store that pointer directly, each new line overwrites the previous one, and when you later attempt to free each line you will likely crash with a "double free or corruption" error because you repeatedly free the same block of memory.

You can use strdup to copy the buffer. However, since strdup allocates storage, you should validate that the allocation succeeded before assigning the pointer to the new block of memory to your collection of lines.

To access each line, you need a pointer to the beginning of each one (the block of memory holding that line). A pointer to pointer to char is generally used (e.g. char **lines;). Memory allocation is generally handled by allocating some reasonable number of pointers to begin with and keeping track of the number you use; when you reach the number you have allocated, you realloc and double the number of pointers.

As with each read, you need to validate each memory allocation (each malloc, calloc, or realloc). You also need to validate the way your program uses the memory it allocates by running it through a memory error checker (such as valgrind on Linux). These tools are simple to use: just run valgrind yourexename.

Putting those pieces together, you can do something similar to the following. The code reads all lines from the filename provided as the first argument to the program (or from stdin by default if no argument is provided) and prints the line number and line to stdout (keep that in mind if you run it on a 50,000-line file).

#define _POSIX_C_SOURCE 200809L /* expose getline/strdup under strict -std=c99/c11 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NPTR 8

int main (int argc, char **argv) {

    size_t ndx = 0,             /* line index */
        nptrs = NPTR,           /* initial number of pointers */
        n = 0;                  /* line alloc size (0, getline decides) */
    ssize_t nchr = 0;           /* return (no. of chars read by getline) */
    char *line = NULL,          /* buffer to read each line */
        **lines = NULL;         /* pointer to pointer to each line */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* allocate/validate initial 'nptrs' pointers */
    if (!(lines = calloc (nptrs, sizeof *lines))) {
        fprintf (stderr, "error: memory exhausted - lines.\n");
        return 1;
    }

    /* read each line with POSIX getline */
    while ((nchr = getline (&line, &n, fp)) != -1) {
        if (nchr && line[nchr - 1] == '\n') /* check trailing '\n' */
            line[--nchr] = 0;               /* overwrite with nul-char */
        char *buf = strdup (line);          /* allocate/copy line */
        if (!buf) {     /* strdup allocates, so validate */
            fprintf (stderr, "error: strdup allocation failed.\n");
            break;
        }
        lines[ndx++] = buf;     /* assign start address for buf to lines */
        if (ndx == nptrs) {     /* if pointer limit reached, realloc */
            /* always realloc to temporary pointer, to validate success */
            void *tmp = realloc (lines, sizeof *lines * nptrs * 2);
            if (!tmp) {         /* if realloc fails, bail with lines intact */
                fprintf (stderr, "error: memory exhausted - realloc.\n");
                break;
            }
            lines = tmp;        /* assign reallocated block to lines */
            /* zero all new memory (optional) */
            memset (lines + nptrs, 0, nptrs * sizeof *lines);
            nptrs *= 2;         /* double number of allocated pointers */
        }
    }
    free (line);                    /* free memory allocated by getline */

    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (size_t i = 0; i < ndx; i++) {
        printf ("line[%3zu] : %s\n", i, lines[i]);
        free (lines[i]);            /* free memory for each line */
    }
    free (lines);                   /* free pointers */

    return 0;
}
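The program above prints the lines in input order, while the question also asks for them to be sorted. One possible way to do that, sketched here with the standard qsort, is a small comparison wrapper around strcmp. The names cmp_lines and sort_lines are illustrative assumptions, not part of the original answer.

```c
#include <stdlib.h>
#include <string.h>

/* qsort hands the comparator pointers to the array elements. Each
 * element is a char *, so each argument is really a char * const *. */
static int cmp_lines (const void *a, const void *b)
{
    const char *sa = *(const char * const *)a;
    const char *sb = *(const char * const *)b;
    return strcmp (sa, sb);
}

/* sort 'n' line pointers in place, in ascending strcmp order */
void sort_lines (char **lines, size_t n)
{
    qsort (lines, n, sizeof *lines, cmp_lines);
}
```

In the program above, a call such as sort_lines (lines, ndx); placed after the read loop and before the print loop would print the lines in sorted order. Only the pointers are rearranged; the line contents themselves are not copied.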

If you don't have getline or strdup, you can easily implement each. There are multiple examples of each on the site. If you cannot find one, let me know. If you have further questions, let me know as well.
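As a rough idea of what such a replacement might look like, here is a minimal sketch of a strdup stand-in built only from malloc and memcpy. The name dup_string is hypothetical, chosen to avoid clashing with a system-provided strdup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: allocate a block large enough for
 * s plus its terminating nul and copy the string into it.
 * Returns NULL if malloc fails; the caller frees the result. */
char *dup_string (const char *s)
{
    size_t len = strlen (s) + 1;    /* +1 for the nul terminator */
    char *copy = malloc (len);

    if (copy)                       /* validate before copying */
        memcpy (copy, s, len);

    return copy;
}
```

A getline substitute follows the fgets idea described earlier: read fixed-size chunks and grow the line buffer with realloc until a '\n' or EOF is seen.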
