Initialize std::string from a possibly NULL char pointer
Initializing std::string from a NULL char pointer is undefined behaviour, I believe. So, here are alternative versions of a constructor, where mStdString is a member variable of type std::string:
// Version 1: fall back to an empty string literal
MyClass::MyClass(const char *cstr) :
    mStdString(cstr ? cstr : "")
{}

// Version 2: fall back to a default-constructed std::string
MyClass::MyClass(const char *cstr) :
    mStdString(cstr ? std::string(cstr) : std::string())
{}

// Version 3: assign in the constructor body
MyClass::MyClass(const char *cstr)
{
    if (cstr) mStdString = cstr;
    // else keep default-constructed mStdString
}
Edit, constructor declaration inside class MyClass:
MyClass(const char *cstr = NULL);
Which of these, or possibly something else, is the best or most proper way to initialize std::string from a possibly NULL pointer, and why? Is it different for different C++ standards? Assume normal release build optimization flags.
I'm looking for an answer with explanation of why a way is the right way, or an answer with a reference link (this also applies if answer is "doesn't matter"), not just personal opinions (but if you must, at least make it just a comment).
The last one is silly because it doesn't use initialization when it could.
The first two are completely identical semantically (think of the c_str() member function), so prefer the first version because it is the most direct and idiomatic, and easiest to read.
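To make the recommendation concrete, here is a minimal sketch of the first version as a complete class. The accessor str() is hypothetical and exists only so the result can be inspected, and nullptr is used in place of NULL on the assumption of a C++11 or later compiler.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the asker's class.
class MyClass {
public:
    // A null pointer is replaced by "" before std::string ever sees it,
    // so the const char* constructor is never handed a null argument.
    MyClass(const char *cstr = nullptr)
        : mStdString(cstr ? cstr : "") {}

    // Illustrative accessor, not part of the original question.
    const std::string &str() const { return mStdString; }

private:
    std::string mStdString;
};
```

With this, MyClass(), MyClass(p) for a null p, and MyClass("") should all leave mStdString empty.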
(There would be a semantic difference if std::string had a constexpr default constructor, but it doesn't, at least before C++20. Still, it's possible that std::string() is different from std::string(""), but I don't know any implementations that do this, since it doesn't seem to make a lot of sense. On the other hand, popular small-string optimizations nowadays mean that both versions will probably not perform any dynamic allocation.)
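The claim that the two empty strings are observably the same can be checked through the public interface. The helper below is a hypothetical name introduced for illustration; it says nothing about internal representation or allocation, which remain implementation details.

```cpp
#include <cassert>
#include <string>

// Returns true when a default-constructed string and one built from ""
// agree on everything checked here: equality, size, and c_str() contents.
bool empty_strings_match() {
    std::string a;      // default constructor
    std::string b("");  // const char* constructor with an empty literal
    return a == b
        && a.empty() && b.empty()
        && a.size() == 0 && b.size() == 0
        && a.c_str()[0] == '\0' && b.c_str()[0] == '\0';
}
```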
Update: As @Jonathan points out, the two string constructors will probably execute different code, and if that matters to you (though it really shouldn't), you might consider a fourth version:
: cstr ? cstr : std::string()
Both readable and default-constructing.
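The fourth version compiles because the conditional operator converts the const char* operand to std::string, so the whole expression has type std::string. A small sketch of that, with fourth_version as a hypothetical free function standing in for the member initializer:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical wrapper around the expression from the fourth version.
inline std::string fourth_version(const char *cstr) {
    // Only const char* -> std::string is an implicit conversion, so the
    // conditional expression as a whole yields a std::string.
    static_assert(
        std::is_same<decltype(cstr ? cstr : std::string()),
                     std::string>::value,
        "conditional yields std::string");
    return cstr ? cstr : std::string();
}
```

The null branch uses the default constructor, and the non-null branch builds a std::string from the pointer.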
Second update: But prefer cstr ? cstr : "". As you can see below, when both branches call the same constructor, this can be implemented very efficiently using conditional moves and no branches. (So the two versions do indeed generate different code, but the first one is better.)
For giggles, I've run both versions through Clang 3.3, with -O3, on x86_64, for a struct foo like yours and a function foo bar(char const * p) { return p; }:
Default constructor (std::string()):
.cfi_offset r14, -16
mov R14, RSI
mov RBX, RDI
test R14, R14
je .LBB0_2
mov RDI, R14
call strlen
mov RDI, RBX
mov RSI, R14
mov RDX, RAX
call _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__initEPKcm
jmp .LBB0_3
.LBB0_2:
xorps XMM0, XMM0
movups XMMWORD PTR [RBX], XMM0
mov QWORD PTR [RBX + 16], 0
.LBB0_3:
mov RAX, RBX
add RSP, 8
pop RBX
pop R14
ret
Empty-string constructor (""):
.cfi_offset r14, -16
mov R14, RDI
mov EBX, .L.str
test RSI, RSI
cmovne RBX, RSI
mov RDI, RBX
call strlen
mov RDI, R14
mov RSI, RBX
mov RDX, RAX
call _ZNSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__initEPKcm
mov RAX, R14
add RSP, 8
pop RBX
pop R14
ret
.L.str:
.zero 1
.size .L.str, 1
In my case, it would even appear that "" generates better code: Both versions call strlen, but the empty-string version doesn't use any jumps, only conditional moves (since the same constructor is called, just with two different arguments). Of course that's a completely meaningless, non-portable and non-transferable observation, but it just goes to show that the compiler doesn't always need as much help as you might think. Just write the code that looks best.
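For anyone who wants to repeat the comparison, here is a plausible reconstruction of the test input. The answer only gives the shape of foo and bar, so the type and function names with the _literal and _default suffixes are assumptions made to keep both variants in one file.

```cpp
#include <cassert>
#include <string>

// Variant with the empty-literal fallback.
struct foo_literal {
    foo_literal(char const *p) : s(p ? p : "") {}
    std::string s;
};

// Variant with the default-constructed fallback.
struct foo_default {
    foo_default(char const *p) : s(p ? std::string(p) : std::string()) {}
    std::string s;
};

// Each function converts p through the implicit constructor on return,
// mirroring `foo bar(char const * p) { return p; }` from the answer.
foo_literal bar_literal(char const *p) { return p; }
foo_default bar_default(char const *p) { return p; }
```

Compiling this with optimization and assembly output enabled (for example -O3 -S) lets the two functions be compared side by side. The generated code will likely differ from the listings above on other compilers, versions, and standard library implementations.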