Why not allow std::string initialization from an array of chars?

Question

In C++ you can initialize a std::string object from a char * or a const char *, and this implicitly assumes that the string ends at the first NUL character found after the pointer.

In C++, however, string literals are arrays, and a template constructor could be used to get the correct size even if the literal contains embedded NULs. See, for example, the following toy implementation:

#include <stdio.h>
#include <string.h>
#include <vector>
#include <string>

struct String {
    std::vector<char> data;
    int size() const { return data.size(); }

    template<typename T> String(const T s);

    // Hack: the array will also possibly contain an ending NUL
    // we don't want...
    template<int N> String(const char (&s)[N])
        : data(s, s+N-(N>0 && s[N-1]=='\0')) {}

    // The non-const array overload is removed, as probably a lot of
    // code builds strings into char arrays and then converts them
    // implicitly to string objects.
    //template<int N> String(char (&s)[N]) : data(s, s+N) {}
};

// (One tricky part is that you cannot just declare a constructor
// accepting a `const char *`, because that would win over the template
// constructor... here I made that constructor a template too, but I'm
// no template programming guru and maybe there are better ways.)
template<> String::String(const char *s) : data(s, s+strlen(s)) {}

int main(int argc, const char *argv[]) {
    String s1 = "Hello\0world\n";
    printf("Length s1 -> %i\n", s1.size());
    const char *s2 = "Hello\0world\n";
    printf("Length s2 -> %i\n", String(s2).size());
    std::string s3 = "Hello\0world\n";
    printf("std::string size = %i\n", int(s3.size()));
    return 0;
}

Is there any specific technical reason why this approach wasn't considered for the standard, so that a string literal with embedded NULs instead ends up being truncated when used to initialize a std::string object?

Answer

Initializing a std::string with a literal that contains embedded null bytes requires passing both the starting pointer and the length to a constructor.
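A minimal sketch of that point, using only the standard std::string constructors. The helper names (from_pointer_only, from_pointer_and_length, demo_pointer_and_length) are made up for illustration and are not part of the question or of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: only a pointer is passed, so the constructor
// scans for the first NUL and stops there.
inline std::string from_pointer_only(const char* text) {
    return std::string(text);
}

// Hypothetical helper: pointer plus explicit length, so embedded NULs
// are copied like any other character.
inline std::string from_pointer_and_length(const char* text, std::size_t length) {
    return std::string(text, length);
}

// Illustrative check using the literal from the question.
inline void demo_pointer_and_length() {
    const char text[] = "Hello\0world\n";         // 12 characters + terminating NUL
    const std::size_t length = sizeof(text) - 1;  // drop the terminating NUL
    assert(from_pointer_only(text).size() == 5);  // only "Hello" survives
    assert(from_pointer_and_length(text, length).size() == 12);
}
```

Calling demo_pointer_and_length() from main exercises both paths; the only difference between them is whether the length travels with the pointer.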

That's easiest if there is a dedicated takes-array-reference constructor template, but as you note

  • such a template, with only the array argument, would lose overload resolution to the non-template constructor taking simply char const*, and

  • it would be unclear whether a final terminating nullvalue should be included or not.

The first point means that the physical code interface would be a single templated constructor, where only the documentation (and not, for example, your editor's tooltip) would tell the full story about what it accepted or not. One fix is to introduce an additional dummy resolver argument, at the cost of some convenience.

The second point is an opportunity for introducing bugs. The most common use of the constructor would no doubt be ordinary string literals. Then, now and then, it would be used for literals and/or arrays with embedded null bytes, but curiously with the last character chopped off.
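One way the dummy resolver argument could look is sketched here. WholeArray and TaggedString are hypothetical names, and the choice to keep every array element (including a trailing NUL) is an assumption of this sketch, not something the standard library offers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical tag type: passing it selects the array-aware constructor.
struct WholeArray {};

class TaggedString {
public:
    // Ordinary constructor: stops at the first NUL, like std::string.
    TaggedString(const char* s) : data_(s, std::strlen(s)) {}

    // Array-aware constructor, reachable only when the tag is passed.
    // Assumption of this sketch: every element is kept, trailing NUL included.
    template <std::size_t N>
    TaggedString(WholeArray, const char (&s)[N]) : data_(s, N) {}

    std::size_t size() const { return data_.size(); }

private:
    std::string data_;
};
```

With this, TaggedString("Hello\0world\n") still holds 5 characters, while TaggedString(WholeArray{}, "Hello\0world\n") holds all 13 array elements. The two constructors take different numbers of arguments, so they never compete in overload resolution, but every caller who wants the array behaviour has to spell out the tag.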
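The ambiguity can be made concrete with the trailing-NUL rule from the question's toy class. The function name length_with_hack and the sample arrays are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>

// The question's rule as a free function: drop the last element when
// it is a NUL, otherwise keep the whole array.
template <std::size_t N>
std::size_t length_with_hack(const char (&s)[N]) {
    return N - (s[N - 1] == '\0' ? 1 : 0);
}

inline void demo_trailing_nul_ambiguity() {
    // Ordinary literal: the NUL is only a terminator, so dropping it is right.
    assert(length_with_hack("abc") == 3);

    // Hypothetical binary record whose last byte is a meaningful zero.
    const char record[3] = {'a', 'b', '\0'};
    assert(length_with_hack(record) == 2);  // the payload byte is lost

    // Same-sized array without a trailing zero keeps all three bytes.
    const char raw[3] = {'a', 'b', 'c'};
    assert(length_with_hack(raw) == 3);
}
```

Two arrays of the same type and size end up with different lengths depending on their last byte, which is the kind of surprise the second point warns about.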

Instead one can simply first name the value,

char const data[] = "*.com\0*.exe\0*.bat\0*.cmd\0";
string s( data, data + sizeof( data ) );    // Including 2 nulls at end.

All that said, when I've defined my own string classes I've included the takes-array-argument constructor, but for a very different reason than convenience. Namely, in the case of a literal the string object can simply hold on to that pointer, with no copying, which provides not only efficiency but also safety (correctness), e.g. in the face of exceptions. And an array of const char is the clearest indication of a literal that we have in C++11 and later.

However, a std::string can't do this: it's not designed for it.
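The non-copying idea described above can be sketched with a separate type. LiteralRef is a hypothetical name, and the sketch assumes that any const char array passed in is a literal (or otherwise outlives the object), which the language itself does not guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning string type: it remembers where the array
// lives and how long it is, and never allocates or copies.
class LiteralRef {
public:
    // Assumption: the array is a string literal with static lifetime.
    template <std::size_t N>
    constexpr LiteralRef(const char (&s)[N]) noexcept
        : ptr_(s), size_(N - 1) {}  // N - 1 skips the terminating NUL

    constexpr const char* data() const noexcept { return ptr_; }
    constexpr std::size_t size() const noexcept { return size_; }

private:
    const char* ptr_;
    std::size_t size_;
};

inline void demo_literal_ref() {
    static const char text[] = "Hello\0world\n";
    LiteralRef ref(text);
    assert(ref.data() == text);  // same storage, nothing was copied
    assert(ref.size() == 12);    // embedded NUL counted, terminator not
}
```

Because construction only stores a pointer and a size, it cannot throw, which is the safety benefit mentioned above.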


If this is often done then one might define a function like this:

using Size = ptrdiff_t;

template< Size n >
auto string_from_data( char const (&data)[n] )
    -> std::string
{ return std::string( data, data + n ); }

Then one can write just

string const s = string_from_data( "*.com\0*.exe\0*.bat\0*.cmd\0" );

Disclaimer: none of the code touched or seen by a compiler.


[I missed this on a first writing, but was reminded by Hurkyl's answer. Now heading for coffee!]

A C++14 string type literal (the s suffix, which requires a using-directive for std::string_literals) chops off the final \0, so with such a literal the above would have to include that terminating null value explicitly:

string const s = "*.com\0*.exe\0*.bat\0*.cmd\0\0"s;

Apart from that, C++14 string type literals appear to provide the sought-for convenience.
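A sketch comparing the two spellings, assuming a C++14 or later compiler. The function names filter_list_literal and filter_list_array are invented here, and string_from_data is restated with std::size_t so that the block stands on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Restated helper: keeps every array element, terminating NUL included.
template <std::size_t N>
std::string string_from_data(const char (&data)[N]) {
    return std::string(data, data + N);
}

// C++14 literal: operator""s receives the length without the terminating
// NUL, so the second trailing \0 is written out explicitly.
inline std::string filter_list_literal() {
    using namespace std::string_literals;  // enables the "..."s suffix
    return "*.com\0*.exe\0*.bat\0*.cmd\0\0"s;
}

// Array version: the literal's own terminator supplies the second NUL.
inline std::string filter_list_array() {
    return string_from_data("*.com\0*.exe\0*.bat\0*.cmd\0");
}
```

Both functions return the same 25-character string: four 6-character entries (pattern plus NUL) followed by one more NUL.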
