stringstream str(), buffer overflows


Problem description


While writing some code to demonstrate different local search
strategies, I found something kind of unusual. I suspect it's a bug in
my understanding rather than a bug in GCC, and I'm hoping someone here
can help me out.

The task is simple: solve the game Boggle using a pure hill-climbing
strategy, returning a tab-separated list of words as a char*. (I'm
interfacing with Python, so the char* is necessary; that seems to be
the easiest way to get Python strings back from C++ code.) When
reading the data out I wind up overrunning the end of the allocated
space, despite the fact that at first blush it appears I'm doing things
right.

I'm retyping the offending code here. For clarity I'm leaving off the
std:: prefix from some calls, but these should be obvious. The
following is a method, not a function; it's declared inline in the
header file.

=====
char* words() const
{
    stringstream ss;
    ostream_iterator<string> oi(ss, "\t");
    copy(_wordset->begin(), _wordset->end(), oi);
    // The stringstream has a trailing '\t' at the end of it
    // which we're not going to copy. This will turn into
    // a trailing '\0'.
    char* rv = new char[ss.str().size()];
    memset((void*) rv, 0, ss.str().size());
    copy(ss.str().begin(), ss.str().end() - 1, rv);
    return rv;
}
=====

On OS X, this code works as expected. On Win32 and Debian, SIGSEGV (or
its Windows equivalent) is caught, with the offending line being the
call to copy. Replacing the last two lines of the method with

string t = ss.str().substr(0, ss.str().size() - 1);
copy(t.begin(), t.end(), rv);
return rv;

... makes everything work just fine, though.

Can anyone give me a clear, concise description of my error in
understanding? Or is this a bug in GCC?

Solution

Robert J. Hansen wrote:

> stringstream ss;
> [...]
> char* rv = new char[ss.str().size()];
> memset((void*) rv, 0, ss.str().size());
> copy(ss.str().begin(), ss.str().end() - 1, rv);
> return rv;
> }



That's a good one :-) Note that the 'str()' method of the string
stream returns the string *by value*, that is, for each call to
'str()' you get a fresh copy. In the above code I spotted a total
of four copies, which can easily turn into an unnecessary performance
problem even if the code happens to work for whatever unfortunate
reason you encounter (a candidate could be the use of some form of
reference-counted copy where the resulting sequence is actually a
valid one). As a result of having multiple copies, you try to
iterate over an invalid range by using the begin iterator of one
sequence and the end iterator of another. Apart from the resulting
range being a different size, you also run the danger of trying to
access arbitrary memory in between.
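
To make the point about mixed iterators concrete, here is a minimal sketch (not code from the thread) in which 'str()' is called exactly once and the result is kept in a named variable, so everything that follows operates on one and the same string object. The helper name join_with_tabs and the use of std::set for the word container are assumptions for illustration; the sketch also assumes C++11 for back() and pop_back().

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): writes every word followed by
// a tab, as the ostream_iterator in the original post does, then strips the
// final tab.
std::string join_with_tabs(const std::set<std::string>& words) {
    std::stringstream ss;
    std::copy(words.begin(), words.end(),
              std::ostream_iterator<std::string>(ss, "\t"));

    // Call str() exactly once and keep the returned copy in a named variable.
    std::string s = ss.str();
    if (s.empty()) {
        return s;  // no words were written, so there is no trailing tab
    }

    // Every member call below refers to the same object s, so there is no
    // risk of pairing iterators or sizes from two different temporaries.
    assert(s.back() == '\t');
    s.pop_back();
    return s;
}
```

By contrast, an expression such as copy(ss.str().begin(), ss.str().end() - 1, rv) takes its two iterators from two separate temporaries, which is the invalid range described above.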

Thus, you should get the string only once and copy it from there.
I'd write the copying portion of the code something like
below, which also avoids the unnecessary call to 'memset()' which
merely assigns values which will be overwritten anyway:

char* to_c_string(std::string const& str) {
    char* rc = new char[str.size()];
    // copy everything but the trailing '\t', then write the
    // terminating 0 where std::copy stopped
    *std::copy(str.begin(), str.end() - 1, rc) = 0;
    return rc;
}

...
return to_c_string(ss.str());
--
<mailto:di***********@yahoo.com> <http://www.dietmar-kuehl.de/>
<http://www.eai-systems.com> - Efficient Artificial Intelligence
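
The helper above relies on two details: std::copy returns the position one past the last element written, and the input must not be empty (otherwise str.end() - 1 is undefined). A possible variant that also tolerates an empty string might look like the sketch below. It keeps the name to_c_string from the post, but the empty-input handling and the use of std::copy_n (C++11) are additions, not something the thread prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Sketch of the helper with the empty-input case handled.
// Assumption: a non-empty input ends with one separator character that
// should be replaced by the terminating '\0'.
// The caller owns the result and must release it with delete[].
char* to_c_string(std::string const& str) {
    // Number of characters to keep: everything except the trailing separator.
    std::size_t const n = str.empty() ? 0 : str.size() - 1;
    char* rc = new char[n + 1];  // +1 for the terminator

    // std::copy_n returns the position one past the last character written,
    // which is exactly where the terminator belongs.
    char* end = std::copy_n(str.begin(), n, rc);
    assert(end == rc + n);
    *end = '\0';
    return rc;
}
```

Because the string is passed in by reference, 'str()' is evaluated once at the call site, for example to_c_string(ss.str()), and the temporary lives until the call returns.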


> Note that the 'str()' method of the string stream returns
> the string *by value*



Thanks! This is exactly the thing I was missing. For some reason I
thought it was returning by reference. In light of this, your remarks
about unnecessary construction of string objects are well taken,
although it's not a performance hit in this problem (the code executes
in sub-second time).

Once more, thanks for the reply!


On Thu, 09 Feb 2006 16:53:37 +0100, Dietmar Kuehl
<di***********@yahoo.com> wrote:

> Robert J. Hansen wrote:
>
>> stringstream ss;
>> [...]
>> char* rv = new char[ss.str().size()];
>> memset((void*) rv, 0, ss.str().size());
>> copy(ss.str().begin(), ss.str().end() - 1, rv);
>> return rv;
>> }
>
> That's a good one :-) Note that the 'str()' method of the string
> stream returns the string *by value*, that is, for each call to
> 'str()' you get a fresh copy.

The main problem with the code is that it mixes low-level (memset,
char[]) and (insufficiently understood) high-level constructs
(stringstream, string, copy). That mixture is always an indication of
problems in the code.

> In the above code I spotted a total
> of four copies, which can easily turn into an unnecessary performance
> problem even if the code happens to work for whatever unfortunate
> reason you encounter (a candidate could be the use of some form of
> reference-counted copy where the resulting sequence is actually a
> valid one). As a result of having multiple copies, you try to
> iterate over an invalid range by using the begin iterator of one
> sequence and the end iterator of another. Apart from the resulting
> range being a different size, you also run the danger of trying to
> access arbitrary memory in between.
>
> Thus, you should get the string only once and copy it from there.
> I'd write the copying portion of the code something like
> below, which also avoids the unnecessary call to 'memset()' which
> merely assigns values which will be overwritten anyway:
>
> char* to_c_string(std::string const& str) {
>     char* rc = new char[str.size()];
>     *std::copy(str.begin(), str.end() - 1, rc) = 0;

Don't know what the above line does ...

>     return rc;
> }
>
> ...
> return to_c_string(ss.str());



Of course, returning a new-ed object to the caller (delete by caller)
is bad style. Very bad style, IMO.

Best regards,
Roland Pibinger
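
One common way to avoid handing a new-ed array to the caller is to let the caller own the buffer, in the style of snprintf. The sketch below is only an illustration of that idea, not something proposed in the thread: the function name copy_to_buffer and its signature are hypothetical. Code that stays entirely in C++ could simply return a std::string by value instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical boundary function (name and signature are assumptions).
// Copies `text` into a buffer owned by the caller, always '\0'-terminating
// it when there is room for at least one byte, and returns the number of
// bytes needed for the full text including the terminator. A return value
// larger than buf_size tells the caller the result was truncated.
std::size_t copy_to_buffer(std::string const& text, char* buf,
                           std::size_t buf_size) {
    // Precondition: a null buffer is only allowed together with size 0.
    assert(buf != nullptr || buf_size == 0);

    std::size_t const needed = text.size() + 1;
    if (buf_size > 0) {
        // Copy as much as fits while leaving one byte for the terminator.
        std::size_t const n = (needed <= buf_size) ? text.size() : buf_size - 1;
        std::memcpy(buf, text.data(), n);
        buf[n] = '\0';
    }
    return needed;
}
```

With this shape nothing is allocated on one side of the interface and freed on the other, so the question of who calls delete[] does not arise.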

