How can I read a known number of strings of unknown size from a .txt file and store each line in a line of a matrix (in C)?

Problem description

The title is pretty self-explanatory. I'm almost sure that the end result won't be a matrix, as each line would have a different number of columns, so it's more like an array of arrays of variable sizes. It would also be interesting to sort the fragments by size, biggest first. This is what I've tried so far:

int main() {
  char str[MAXLEN], **fragmentsList;
  int number_of_strings, i, max, k;
  printf("Enter .txt file name: ");
  scanf("%s", str);
  printf("How many strings does the file has? ");
  scanf("%d", &number_of_strings);
  FILE *arq;
  arq = fopen(str, "r");
  for (i = 0, max = 0; !feof(arq); i++) {
    while (fscanf("%c") != '\n') {
      max++;
    }
    if (max > k) {
      k = max;
    }
  }
  fclose(arq);
  fragmentsList = malloc(k * sizeof(char));
  *fragmentsList = malloc(number_of_strings * sizeof(char));
  arq = fopen(str, "r");
  for (i = 0; !feof(arq); i++) {
    fscanf(arq, "%s", fragmentList[i]);
  }
  for (i = 0; i < number_of_strings; i++) {
    printf("%s", fragmentList[i]);
  }
  return 0;
}

Recommended answer

Reading an unknown number of lines from a file into memory in C is a basic necessity. There are a couple of ways to approach it, but the standard practice is to:

  • declare a pointer to pointer to type (char ** for lines in a file) so you can collect and reference each line after it is read into memory;

  • allocate some reasonably anticipated number of pointers to begin with, to avoid repeated calls to realloc that allocate pointers for each line individually (initially allocating 8, 16, 32, ... all work fine);

  • declare a variable to track the number of lines read, and increment it for each line;

  • read each line of the file into a buffer (POSIX getline works particularly well because it dynamically allocates sufficient storage to handle any line length, which frees you from reading with a fixed buffer and having to allocate for and accumulate partial lines until the end of the line is reached);

  • allocate storage for each line, copy the line to the new storage, and assign the beginning address to your next pointer (strdup does both the allocation and the copy for you, but since it allocates, make sure you validate that it succeeds);

  • when your index reaches your current number of allocated pointers, realloc more pointers (generally by doubling the number, or increasing it by 3/2; the rate of increase isn't particularly important, what is important is ensuring you always have a valid pointer to assign the new block of memory holding your line to); and

  • repeat until the file is completely read.

There are a few subtleties to be aware of when reallocating memory. First, never assign the return of realloc directly to the pointer being reallocated, e.g. do not do:

mypointer = realloc (mypointer, current_size * 2);

If realloc fails, it returns NULL, and if you assign the return to your original pointer you overwrite the address of your current data with NULL, creating a memory leak. Instead, always use a temporary pointer and validate that realloc succeeded before assigning the new block of memory to your original pointer, e.g.

    if (filled_pointers == allocated_pointers) {
        void *tmp = realloc (mypointer, current_size * 2);

        if (tmp == NULL) {
            perror ("realloc-mypointer");
            break;      /* or use goto to jump out of your read loop,
                         * preserving access to your current data in
                         * the original pointer.
                         */
        }
        mypointer = tmp;
        current_size *= 2;
    }
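
The same pattern can be wrapped in a small helper so the read loop stays short. The following is only a sketch: the name grow_ptrs and its 1/0 return convention are assumptions made for illustration, not part of the original answer, and the overflow check is an extra precaution.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: double the capacity of an array of line pointers.
 * Returns 1 on success, 0 on failure. On failure *arr and *cap are left
 * unchanged, so the caller still owns (and can free) everything stored
 * so far.
 */
int grow_ptrs (char ***arr, size_t *cap)
{
    assert (arr && cap && *cap > 0);

    /* refuse to double if the byte count would overflow size_t */
    if (*cap > SIZE_MAX / 2 / sizeof **arr)
        return 0;

    /* always realloc to a temporary pointer and validate */
    void *tmp = realloc (*arr, 2 * *cap * sizeof **arr);
    if (tmp == NULL) {
        perror ("realloc-grow_ptrs");
        return 0;
    }

    *arr = tmp;     /* only now replace the original pointer */
    *cap *= 2;
    return 1;
}
```

A read loop would then call something like `if (ndx == nptrs && !grow_ptrs (&lines, &nptrs)) break;` and carry on with the lines already collected.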

Putting the pieces all together in an example using getline, you can do something like the following. (Note: the code expects the filename to read from as the first argument to your program; if no argument is given, the program reads from stdin by default.)

#define _POSIX_C_SOURCE 200809L /* expose getline/strdup under strict -std=c11 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NPTR 8      /* initial number of pointers (must be > 0) */

int main (int argc, char **argv) {

    size_t ndx = 0,             /* line index */
        nptrs = NPTR,           /* initial number of pointers */
        n = 0;                  /* line alloc size (0, getline decides) */
    ssize_t nchr = 0;           /* return (no. of chars read by getline) */
    char *line = NULL,          /* buffer to read each line */
        **lines = NULL;         /* pointer to pointer to each line */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;

    if (!fp) {  /* validate file open for reading */
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* allocate/validate initial 'nptrs' pointers */
    if (!(lines = calloc (nptrs, sizeof *lines))) {
        perror ("calloc - lines");
        return 1;
    }

    /* read each line with POSIX getline */
    while ((nchr = getline (&line, &n, fp)) != -1) {
        if (nchr && line[nchr - 1] == '\n') /* check trailing '\n' */
            line[--nchr] = 0;               /* overwrite with nul-char */
        char *buf = strdup (line);          /* allocate/copy line */
        if (!buf) {             /* strdup allocates, so validate */
            perror ("strdup-line");
            break;
        }
        lines[ndx++] = buf;     /* assign start address for buf to lines */
        if (ndx == nptrs) {     /* if pointer limit reached, realloc */
            /* always realloc to temporary pointer, to validate success */
            void *tmp = realloc (lines, sizeof *lines * nptrs * 2);
            if (!tmp) {         /* if realloc fails, bail with lines intact */
                perror ("realloc - lines");
                break;          /* don't exit, lines holds current lines */
            }
            lines = tmp;        /* assign reallocated block to lines */
            /* zero all new memory (optional) */
            memset (lines + nptrs, 0, nptrs * sizeof *lines);
            nptrs *= 2;         /* increment number of allocated pointers */
        }
    }
    free (line);                    /* free memory allocated by getline */

    if (fp != stdin) fclose (fp);   /* close file if not stdin */

    for (size_t i = 0; i < ndx; i++) {
        printf ("line[%3zu] : %s\n", i, lines[i]);
        free (lines[i]);            /* free memory for each line */
    }
    free (lines);                   /* free pointers */

    return 0;
}
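
The question also asks about sorting the fragments by size, biggest first, which the program above does not do. One possible way, sketched here with the standard qsort, is a comparator on string length. The names cmp_len_desc and sort_by_length are made up for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: each argument points to an element of the array,
 * i.e. to a char *. Longer strings sort first.
 */
int cmp_len_desc (const void *a, const void *b)
{
    const char *sa = *(const char * const *)a;
    const char *sb = *(const char * const *)b;
    size_t la = strlen (sa), lb = strlen (sb);

    /* negative when a is longer, positive when b is longer, 0 if equal */
    return (la < lb) - (la > lb);
}

/* Hypothetical wrapper: sort n collected lines by length, biggest first. */
void sort_by_length (char **lines, size_t n)
{
    assert (lines != NULL || n == 0);

    if (n > 1)
        qsort (lines, n, sizeof *lines, cmp_len_desc);
}
```

In the program above this would be called as `sort_by_length (lines, ndx);` before the loop that prints the lines. Note that qsort is not guaranteed to be stable, so lines of equal length may end up in any relative order.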

Look things over and let me know if you have further questions. If you do not have getline or strdup available, let me know and I'm happy to help further with an implementation that will provide their behavior.
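
For reference, a rough sketch of what such fallbacks might look like using only standard C (fgets, malloc, realloc). The names read_line and dup_string are hypothetical, the interface deliberately differs from getline (it returns a newly allocated string per call instead of reusing a buffer), and a read error is treated the same as end-of-file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: copy s into newly allocated storage. */
char *dup_string (const char *s)
{
    assert (s != NULL);

    size_t len = strlen (s) + 1;        /* +1 for the nul-terminator */
    char *copy = malloc (len);

    return copy ? memcpy (copy, s, len) : NULL;
}

/* Hypothetical stand-in for getline: read one line of any length from fp.
 * Returns an allocated string without the trailing '\n' (caller frees),
 * or NULL at end-of-file or on allocation failure.
 * Assumes a single line is shorter than INT_MAX characters.
 */
char *read_line (FILE *fp)
{
    size_t cap = 64, len = 0;           /* buffer capacity, chars stored */
    char *buf = malloc (cap);

    if (!buf)
        return NULL;

    while (fgets (buf + len, (int)(cap - len), fp)) {
        len += strlen (buf + len);
        if (len && buf[len - 1] == '\n') {  /* complete line read */
            buf[--len] = 0;                 /* trim the '\n' */
            return buf;
        }
        if (len + 1 == cap) {           /* buffer full, line continues */
            void *tmp = realloc (buf, cap * 2);
            if (!tmp) {
                free (buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }

    if (len == 0) {                     /* EOF with nothing read */
        free (buf);
        return NULL;
    }
    return buf;                         /* last line had no trailing '\n' */
}
```

Because read_line already returns a separately allocated string for every line, its result can be stored straight into lines[ndx++] without a further copy.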
