What is the right way to handle char* strings?


Problem description

I have a third party library that is using char* (non-const) as placeholder for string values. What is the right and safe way to assign values to those datatypes? I have the following test benchmark that uses my own timer class to measure execution times:

#include <cstring>
#include <iostream>
#include <sj/timer_chrono.hpp>

using namespace std;

int main()
{
    sj::timer_chrono sw;

    int iterations = 1e7;

    // first method gives compiler warning:
    // conversion from string literal to 'char *' is deprecated [-Wdeprecated-writable-strings]
    cout << "creating c-strings unsafe(?) way..." << endl;
    sw.start();
    for (int i = 0; i < iterations; ++i)
    {
        char* str = "teststring";
    }   
    sw.stop();
    cout << sw.elapsed_ns() / (double)iterations << " ns" << endl;

    cout << "creating c-strings safe(?) way..." << endl;
    sw.start();
    for (int i = 0; i < iterations; ++i)
    {
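        // allocate strlen + 1 bytes so the terminating '\0' fits
        char* str = new char[strlen("teststring") + 1];
        strcpy(str, "teststring");
        // the buffer is never freed here; real code must delete[] it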
    }   
    sw.stop();
    cout << sw.elapsed_ns() / (double)iterations << " ns" << endl;


    return 0;

}

Output:

creating c-strings unsafe(?) way...
1.9164 ns
creating c-strings safe(?) way...
31.7406 ns

While the "safe" way gets rid of the compiler warning, it makes the code about 15-20 times slower according to this benchmark (1.9 nanoseconds per iteration vs 31.7 nanoseconds per iteration). What is the correct way, and what is so dangerous about the "deprecated" way?

Solution

The C++ standard is clear:

An ordinary string literal has type "array of n const char" (section 2.14.5, paragraph 8 in C++11).

and

The effect of attempting to modify a string literal is undefined (section 2.14.5, paragraph 12 in C++11).

For a string known at compile time, the safe way of obtaining a non-const char* is to initialize an array from the literal:

char literal[] = "teststring";

The array then converts to a writable pointer without any warning:

char* ptr = literal;
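
As a minimal sketch of how this looks at a call site: `legacy_set_name` below is a hypothetical stand-in for a third-party function taking a non-const char*, not an API from the question. It writes to its argument to show why the buffer must be mutable.

```cpp
#include <cstring>

// Hypothetical stand-in for a third-party function that takes a
// non-const char* and may write to the buffer. Not a real library API.
void legacy_set_name(char* name)
{
    if (name[0] != '\0')
        name[0] = 'T'; // legal: the caller passed a writable array
}

// Builds a mutable copy of a literal, hands it to the API and returns
// the first character afterwards.
char call_with_mutable_copy()
{
    char literal[] = "teststring"; // 11 chars, initialized from the literal
    char* ptr = literal;           // array-to-pointer conversion, no warning
    legacy_set_name(ptr);
    return literal[0];
}
```

Had the same function been given a pointer to the literal itself, the write would be undefined behavior.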

If at compile time you don't know the string but know its length you can use an array:

char str[STR_LENGTH + 1];
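
A sketch of filling such a buffer safely, assuming `STR_LENGTH` is 10 for illustration. The helper name `copy_fixed` is made up for this example. It refuses input that would not fit rather than truncating it.

```cpp
#include <cstddef>
#include <cstring>

const std::size_t STR_LENGTH = 10; // assumed maximum length, excluding '\0'

// Copies src into dst if it fits (including the terminator);
// returns false and leaves dst untouched otherwise.
bool copy_fixed(char (&dst)[STR_LENGTH + 1], const char* src)
{
    const std::size_t len = std::strlen(src);
    if (len > STR_LENGTH)
        return false;
    std::memcpy(dst, src, len + 1); // +1 copies the terminating '\0'
    return true;
}
```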

If you don't know the length then you will need to use dynamic allocation. Make sure you deallocate the memory when the strings are no longer needed.
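
One way to get dynamic allocation without a manual delete[] is to let a standard container own the buffer. This is an alternative to raw new/delete rather than something the answer prescribes, and the helper name `make_writable` is hypothetical.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Copies a run-time string into a writable, NUL-terminated buffer.
// The vector releases the memory when it goes out of scope.
std::vector<char> make_writable(const std::string& s)
{
    std::vector<char> buf(s.begin(), s.end());
    buf.push_back('\0'); // C APIs expect the terminator
    return buf;
}
```

The result of `buf.data()` can then be passed wherever a char* is expected, as long as the vector outlives the call and the callee does not keep or free the pointer.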

This will work only if the API doesn't take ownership of the char* you pass.

If it tries to deallocate the strings internally then it should say so in the documentation and inform you on the proper way to allocate the strings. You will need to match your allocation method with the one used internally by the API.
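
To illustrate matching the allocation method, suppose (purely as an assumption) that the library releases strings with std::free. `api_take_ownership` is a hypothetical placeholder for such a function. The string must then come from std::malloc, not from new[].

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical API that takes ownership of the string and releases it
// with std::free. A real library documents its own rule.
void api_take_ownership(char* s)
{
    std::free(s);
}

// Allocates the copy with std::malloc so that it matches std::free.
// Returns nullptr if the allocation fails.
char* duplicate_for_api(const char* src)
{
    const std::size_t size = std::strlen(src) + 1;
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy != nullptr)
        std::memcpy(copy, src, size);
    return copy;
}
```

If the library used delete[] instead, the copy would have to be created with new char[size].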

The declaration

char literal[] = "test";

creates a local, 5-character array with automatic storage (meaning the variable is destroyed when execution leaves the scope in which it is declared) and initializes its elements with the characters 't', 'e', 's', 't' and '\0'.

You can later edit these characters: literal[2] = 'x';
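
The array size and the edit can be checked with a small sketch. The function name `literal_array_demo` is only for illustration.

```cpp
#include <cstring>

// Returns true if the array holds "text" after the edit.
bool literal_array_demo()
{
    char literal[] = "test";
    static_assert(sizeof(literal) == 5, "4 characters plus the terminator");
    literal[2] = 'x'; // modifies the local copy, not the literal itself
    return std::strcmp(literal, "text") == 0;
}
```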

If you write this:

char* str1 = "test";
char* str2 = "test";

then, depending on the compiler, str1 and str2 may be the same value (i.e., point to the same string).

("Whether all string literals are distinct (that is, are stored in nonoverlapping objects) is implementation-defined." Section 2.14.5, paragraph 12 of the C++11 standard.)

It may also be true that they are stored in a read-only section of memory, in which case an attempt to modify the string typically crashes the program (for example with a segmentation fault).
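
A sketch for checking whether a given compiler shares storage between identical literals. It uses const char* so that it compiles cleanly. The function names are hypothetical, and the result of the first function is deliberately not fixed because the standard leaves it implementation-defined.

```cpp
#include <cstring>

// May return true or false depending on the compiler and its options.
bool literals_share_storage()
{
    const char* str1 = "test";
    const char* str2 = "test";
    return str1 == str2; // compares addresses, not contents
}

// Comparing contents is always well-defined and always true here.
bool literals_have_same_contents()
{
    const char* str1 = "test";
    const char* str2 = "test";
    return std::strcmp(str1, str2) == 0;
}
```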

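String literals are also, in reality, of type const char[N] (which decays to const char*), so this line:

char* str = "test";

silently drops the const qualifier from the string, which is why the compiler issues the warning. This implicit conversion was deprecated in C++03 and removed in C++11; compilers that still accept it do so as an extension.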
