Reading Strings with Undefined Length in C


Question


First (as always), I want to apologize for my English; it may not be clear enough.

I'm not that good at C programming, and I was asked to read a "string" input with undefined length.

This is my solution:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *newChar();
char *addChar(char *, char);
char *readLine(void);

int main() {
  char *palabra;
  palabra = newChar();

  palabra = readLine();
  printf("palabra=%s\n", palabra);

  return 0;
}

char *newChar() {
  char *list = (char *) malloc(0 * sizeof (char));
  *list = '\0';
  return list;
}

char *addChar(char *lst, char num) {
  int largo = strlen(lst) + 1;
  realloc(&lst, largo * sizeof (char));
  *(lst + (largo - 1)) = num;
  *(lst + largo) = '\0';
  return lst;
}

char *readLine() {
  char c;
  char *palabra = newChar();

  c = getchar();
  while (c != '\n') {
    if (c != '\n') {
      palabra = addChar(palabra, c);
    }
    c = getchar();
  }
  return palabra;
}

I'd appreciate it if you could tell me whether this is a good idea, or suggest another approach (and also tell me whether this is a "correct" use of pointers).

Thanks in advance.


EDIT: Thanks for your answers; they were very useful. I'm now posting the edited (and, I hope, better) code. It may be useful to someone new to C (like me), and I'd welcome further feedback.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


void reChar(char **, int *);
void readLine(char **, int *);

int main() {
    char *palabra = NULL;
    int largo = 0;

    reChar(&palabra, &largo);
    readLine(&palabra, &largo);
    printf("palabra=%s\n", palabra);

    system("pause");
    return 0;
}

void reChar(char **lst, int *largo) {
    (*largo) += 4;
    char *temp = (char*) realloc(*lst, (*largo) * sizeof (char));

    if (temp != NULL) {
        *lst = temp;
    } else {
        free(*lst);
        puts("error (re)allocating memory");
        exit(1);
    }
}

void readLine(char **lst, int *largo) {
    int c;
    int pos = 0;

    c = getchar();
    while (c != '\n' && c != EOF) {
        if ((pos + 1) % 4 == 0) {
            reChar(lst, largo);
        }
        (*lst)[pos] =(char) c;
        pos++;
        c = getchar();
    }
    (*lst)[pos] = '\0';
}

PS:

  • Growing "palabra" slowly (4 bytes at a time) seems to be enough.

  • I'm not sure whether capturing getchar() in an int and then casting it to a char is the correct way to handle the EOF pitfall.

Solution

  1. Look up the definition of POSIX getline().

  2. Remember that you need to capture the return value from realloc(); it is not guaranteed that the new memory block starts at the same position as the old one.

  3. Know that malloc(0) may return a null pointer, or it may return a non-null pointer that is unusable (because it points to zero bytes of memory).

  4. You may not write *list = '\0'; when list points to zero bytes of allocated memory; you don't have permission to write there. If you get a NULL back, you are likely to get a core dump. In any case, you are invoking undefined behaviour, which is 'A Bad Idea™'. (Thanks)

  5. The palabra = newChar(); in main() leaks memory - assuming that you fix the other problems already discussed.

  6. The code in readLine() doesn't consider the possibility of getting EOF before getting a newline; that is bad and will result in a core dump when memory allocation (finally) fails.

  7. Your code will exhibit poor performance because it allocates one character at a time. Typically, you should allocate considerably more than one extra character at a time; starting with an initial allocation of perhaps 4 bytes and doubling the allocation each time you need more space might be better. Keep the initial allocation small so that the reallocation code is properly tested.

  8. The return value from getchar() is an int, not a char. On most machines, it can return 256 different positive character values (even if char is a signed type) and a separate value, EOF, that is distinct from all the char values. (The standard allows it to return more than 256 different characters if the machine has bytes that are bigger than 8 bits each.) (Thanks) The C99 standard §7.19.7.1 says of fgetc():

    If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

    (Emphasis added.) It defines getchar() in terms of getc(), and it defines getc() in terms of fgetc().

  9. (Borrowed: Thanks). The first argument to realloc() is the pointer to the start of the currently allocated memory, not a pointer to the pointer to the start of the currently allocated memory. If you didn't get a compilation warning from it, you are not compiling with enough warnings set on your compiler. You should turn up the warnings to the maximum. You should heed the compiler warnings - they are normally indicative of bugs in your code, especially while you are still learning the language.

  10. It is often easier to keep the string without a null terminator until you know you have reached the end of the line (or end of input). When there are no more characters to be read (for the time being), append the null so that the string is properly terminated before it is returned. These functions do not need the string to be properly terminated while they are reading, as long as you keep track of where you are in the string. Do make sure you have enough room at all times to add the NUL '\0' to the end of the string, though.
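
Several of the points above (capturing the result of realloc(), doubling the allocation, holding the character in an int, stopping on EOF, and terminating the string only at the end) can be pulled together in one function. This is a minimal sketch, not the original poster's code: the name read_line, its FILE * parameter, and the choice to return NULL on failure are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * read_line: hypothetical helper (not from the original post).
 * Reads one line from `stream` into a malloc'd buffer, dropping the '\n'.
 * Returns NULL if allocation fails, or if EOF (or a read error) occurs
 * before any character was read. The caller must free() the result.
 */
char *read_line(FILE *stream)
{
    size_t cap = 4;            /* small on purpose, so the growth path is exercised */
    size_t len = 0;
    char *buf = malloc(cap);
    int c;                     /* int, not char, so EOF stays distinct from every char */

    if (buf == NULL)
        return NULL;

    while ((c = fgetc(stream)) != EOF && c != '\n') {
        if (len + 1 == cap) {  /* always keep one byte spare for the final '\0' */
            size_t new_cap = cap * 2;
            char *tmp = realloc(buf, new_cap);  /* pass the pointer itself, keep the result */
            if (tmp == NULL) {
                free(buf);     /* the old block is still valid when realloc fails */
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[len++] = (char)c;
    }

    if (c == EOF && len == 0) {
        free(buf);             /* nothing was read: report end of input */
        return NULL;
    }

    buf[len] = '\0';           /* terminate once, after the last character */
    return buf;
}
```

A caller would use it as char *line = read_line(stdin); then check for NULL, use the string, and free(line) when done. Returning NULL instead of calling exit() leaves the error policy to the caller; on POSIX systems, getline() is an existing function that covers the same ground.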

See Kernighan & Pike 'The Practice of Programming' for a lot of relevant discussions. I also think Maguire 'Writing Solid Code' has relevant advice to offer, for all it is somewhat dated. However, you should be aware that there are those who excoriate the book. Consequently, I recommend TPOP over WSC (but Amazon has WSC available from $0.01 + p&p, whereas TPOP starts at $20.00 + p&p -- this may be the market speaking).


TPOP was previously at http://plan9.bell-labs.com/cm/cs/tpop and http://cm.bell-labs.com/cm/cs/tpop but both are now (2015-08-10) broken. See also Wikipedia on TPOP.
