Best practice for returning a variable-length string in C

Question

I have a string function that accepts a pointer to a source string and returns a pointer to a destination string. This function currently works, but I'm worried I'm not following best practice regarding malloc, realloc, and free.

The thing that's different about my function is that the length of the destination string is not the same as the source string, so realloc() has to be called inside my function. I know from looking at the docs...

http://www.cplusplus.com/reference/cstdlib/realloc/

that the memory address might change after the realloc. This means I can't "pass by reference" like a C programmer might for other functions; I have to return the new pointer.

So the prototype for my function is:

//decode a uri encoded string
char *net_uri_to_text(char *);

I don't like the way I'm doing it because I have to free the pointer after running the function:

char * chr_output = net_uri_to_text("testing123%5a%5b%5cabc");
printf("%s\n", chr_output); //testing123Z[\abc
free(chr_output);

Which means that malloc() and realloc() are called inside my function and free() is called outside my function.

I have a background in high-level languages (Perl, PL/pgSQL, Bash), so my instinct is proper encapsulation of such things, but that might not be the best practice in C.

The question: Is my way best practice, or is there a better way I should follow?

Full example

It compiles and runs with two warnings about the unused argc and argv arguments; you can safely ignore those two warnings.

example.c:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char *net_uri_to_text(char *);

int main(int argc, char ** argv) {
  char * chr_input = "testing123%5a%5b%5cabc";
  char * chr_output = net_uri_to_text(chr_input);
  if (chr_output == NULL) {
    return 1;
  }
  printf("%s\n", chr_output);
  free(chr_output);
  return 0;
}

//decodes uri-encoded string
//send pointer to source string
//return pointer to destination string, or NULL if allocation fails
//WARNING!! YOU MUST USE free(chr_result) AFTER YOU'RE DONE WITH IT OR YOU WILL GET A MEMORY LEAK!
char *net_uri_to_text(char * chr_input) {
  //define variables
  int int_length = (int)strlen(chr_input);
  int int_new_length = int_length;
  //one extra byte for the terminating null
  char * chr_output = malloc(int_length + 1);
  if (chr_output == NULL) {
    return NULL;
  }
  char * chr_output_working = chr_output;
  char * chr_input_working = chr_input;
  int int_output_working = 0;
  unsigned int uint_hex_working;
  //while not a null byte
  while(*chr_input_working != '\0') {
    //if % followed by two more characters that parse as hex
    if (*chr_input_working == '%'
        && chr_input_working[1] != '\0' && chr_input_working[2] != '\0'
        && sscanf(chr_input_working + 1, "%2x", &uint_hex_working) == 1) {
      //then put correct char in
      *chr_output_working = (char)uint_hex_working;
      //skip the two hex digits
      chr_input_working += 2;
      //realloc: the output is now two bytes shorter (keep room for the null)
      int_new_length -= 2;
      char * chr_resized = realloc(chr_output, int_new_length + 1);
      if (chr_resized == NULL) {
        free(chr_output);
        return NULL;
      }
      chr_output = chr_resized;
      //output working must be the new pointer plus how many chars we've done
      chr_output_working = chr_output + int_output_working;
    } else {
      //put char in
      *chr_output_working = *chr_input_working;
    }
    //increment pointers and number of chars in output working
    chr_input_working++;
    chr_output_working++;
    int_output_working++;
  }
  //last null byte
  *chr_output_working = '\0';
  return chr_output;
}

Solution

It's perfectly OK to return malloc'd buffers from functions in C, as long as you document the fact that they do. Lots of libraries do that, even though almost no function in the standard library does (strdup, which only became standard in C23, is one exception).
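As a minimal sketch of what "document it" could look like, the version below states the ownership rule in its header comment. The names uri_decode_alloc and hex_value are made up for illustration and are not from the question. It also relies on the observation that decoded text is never longer than the encoded input, so a single malloc is enough and realloc can be dropped; malformed escapes are simply copied through, which is an assumption about the desired behavior.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decodes a uri-encoded string.
 *
 * Returns a newly malloc'd, NUL-terminated string, or NULL if the
 * allocation fails. OWNERSHIP: the caller must release the result
 * with free().
 */
char *uri_decode_alloc(const char *encoded)
{
    assert(encoded != NULL);

    /* Decoding never grows the text, so strlen + 1 bytes always suffice. */
    char *out = malloc(strlen(encoded) + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (const char *src = encoded; *src != '\0'; src++) {
        int hi, lo;
        if (*src == '%' && (hi = hex_value(src[1])) >= 0
                        && (lo = hex_value(src[2])) >= 0) {
            *dst++ = (char)(hi * 16 + lo);
            src += 2;        /* skip the two hex digits */
        } else {
            *dst++ = *src;   /* ordinary or malformed byte: copy as-is */
        }
    }
    *dst = '\0';
    return out;
}
```

The caller side stays the same as in the question: call the function, use the string, then free() it.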

If you can compute (a not too pessimistic upper bound on) the number of characters that need to be written to the buffer cheaply, you can offer a function that does that and let the user call it.
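For URI decoding specifically, such a size function is cheap to write. The sketch below shows two hypothetical variants (both names are invented here): a constant-work upper bound based on the fact that decoding never lengthens the text, and an exact count that scans the input once. Treating a '%' that is not followed by two hex digits as a literal character is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Upper bound on the bytes needed for the decoded string, including NUL. */
size_t uri_decoded_size_bound(const char *encoded)
{
    assert(encoded != NULL);
    return strlen(encoded) + 1;
}

/* Exact number of bytes needed for the decoded string, including NUL. */
size_t uri_decoded_size(const char *encoded)
{
    assert(encoded != NULL);

    size_t n = 1;   /* the terminating NUL */
    for (const char *p = encoded; *p != '\0'; p++) {
        /* A valid %XX escape collapses three input bytes into one. */
        if (*p == '%' && isxdigit((unsigned char)p[1])
                      && isxdigit((unsigned char)p[2]))
            p += 2;
        n++;
    }
    return n;
}
```

The user can then allocate the buffer however they like (heap, stack, arena) before calling a decoder that writes into it.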

It's also possible, but much less convenient, to accept a buffer to be filled in; I've seen quite a few libraries that do that like so:

/*
 * Decodes uri-encoded string encoded into buf of length len (including NUL).
 * Returns the number of characters required (including NUL). If that number
 * is greater than len, the output did not fit and you should try again with
 * a larger buffer.
 */
size_t net_uri_to_text(char const *encoded, char *buf, size_t len)
{
    size_t space_needed = 0;

    while (decoding_needs_to_be_done()) {
        // decode characters, but only write them to buf
        // if it wouldn't overflow;
        // increment space_needed regardless
    }
    return space_needed;
}

Now the caller is responsible for the allocation, and would do something like

size_t len = SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH;
char *result = xmalloc(len);

len = net_uri_to_text(input, result, len);
if (len > SOME_VALUE_THAT_IS_USUALLY_LONG_ENOUGH) {
    // try again with a buffer of the required size
    result = xrealloc(result, len);
    net_uri_to_text(input, result, len);
}

(Here, xmalloc and xrealloc are "safe" allocating functions that I made up to skip NULL checks.)
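One possible way to fill in the buffer-based pattern is sketched below. The names uri_decode_into and uri_decode_retry, the initial guess of 16 bytes, and the choice to truncate and NUL-terminate when the buffer is too small are all assumptions made for illustration; plain malloc and realloc with explicit NULL checks stand in for the made-up xmalloc and xrealloc.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Decodes `encoded` into buf, writing at most len bytes (including NUL).
 * Returns the number of bytes required for the full result, including NUL.
 * If the return value is greater than len, the output was truncated.
 */
size_t uri_decode_into(const char *encoded, char *buf, size_t len)
{
    assert(encoded != NULL);

    size_t needed = 0;
    for (const char *p = encoded; *p != '\0'; p++) {
        char c = *p;
        if (c == '%' && isxdigit((unsigned char)p[1])
                     && isxdigit((unsigned char)p[2])) {
            char hex[3] = { p[1], p[2], '\0' };
            c = (char)strtoul(hex, NULL, 16);
            p += 2;              /* skip the two hex digits */
        }
        if (needed + 1 < len)    /* write only if room remains for the NUL */
            buf[needed] = c;
        needed++;                /* count the byte regardless */
    }
    if (len > 0)
        buf[needed < len ? needed : len - 1] = '\0';
    return needed + 1;
}

/* Caller side: guess a size, then retry once with the exact size. */
char *uri_decode_retry(const char *encoded)
{
    size_t len = 16;             /* arbitrary first guess */
    char *result = malloc(len);
    if (result == NULL)
        return NULL;

    size_t needed = uri_decode_into(encoded, result, len);
    if (needed > len) {
        char *bigger = realloc(result, needed);
        if (bigger == NULL) {
            free(result);
            return NULL;
        }
        result = bigger;
        uri_decode_into(encoded, result, needed);
    }
    return result;
}
```

With this shape the decoder itself never allocates, so the same function serves callers that use stack buffers, static buffers, or their own allocator.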
