How do I safely and sensibly determine whether a pointer points somewhere into a specified buffer?
Question
I'm looking to implement a function that determines whether a given pointer points into a given buffer. The specification:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len);
If there is some n, 0 <= n && n < len, for which p == buf + n, returns true.
Otherwise, if there is some n, 0 <= n && n < len * sizeof(T), for which reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n, the behaviour is undefined.
Otherwise, returns false.
The obvious implementation would look something like
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    return p >= buf && p < buf + len;
}
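but that is not reliable in standard C++: relational comparisons of pointers are only meaningfully defined for pointers into the same array (or one past its end). For unrelated pointers, C++ leaves the result unspecified, and C makes the behaviour undefined.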
An alternative would be to use the standard library's comparer objects:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    return std::greater_equal<T *>()(p, buf) && std::less<T *>()(p, buf + len);
}
which is guaranteed to return true when I want it to return true, and avoids undefined behaviour, but allows for false positives: given int a; int b;, it allows a result of true for points_into_buffer(&a, &b, 1).
It can be implemented as a loop:
template <typename T>
bool points_into_buffer (T *p, T *buf, std::size_t len) {
    for (std::size_t i = 0; i != len; i++)
        if (p == buf + i)
            return true;
    return false;
}
However, compilers have trouble optimising away that loop.
Is there a valid way of writing this, where with current compilers and optimisations enabled, the result is determined in constant time?
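To make the trade-off concrete, here is a minimal sketch that puts the two well-defined variants from the question side by side. The names points_into_buffer_ordered and points_into_buffer_loop are hypothetical, chosen here only to tell the two apart. For pointers that really point into the buffer, both must return true; for an unrelated object, only the loop variant is guaranteed to return false.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical name: the std::greater_equal / std::less variant.
// Well-defined for any pair of pointers, but the total order it uses
// may place an unrelated object "between" buf and buf + len.
template <typename T>
bool points_into_buffer_ordered(T *p, T *buf, std::size_t len) {
    return std::greater_equal<T *>()(p, buf) && std::less<T *>()(p, buf + len);
}

// Hypothetical name: the loop variant.
// Uses only equality comparisons, so it never reports a false positive,
// at the cost of O(len) work unless the compiler removes the loop.
template <typename T>
bool points_into_buffer_loop(T *p, T *buf, std::size_t len) {
    for (std::size_t i = 0; i != len; i++)
        if (p == buf + i)
            return true;
    return false;
}
```

Given int buf[4], both variants return true for buf + 2 and false for the one-past-the-end pointer buf + 4. They can only differ for pointers to objects outside buf.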
Recommended answer
As far as I can tell, this is a portable implementation of the function I'm after for all possible implementations:
#include <cstddef>  // std::size_t
#include <cstdint>  // std::uintptr_t and UINTPTR_MAX, where available

#ifdef UINTPTR_MAX

bool points_into_buffer(std::uintptr_t p, std::uintptr_t buf, std::size_t len)
{
    const auto diff = p + 0u - buf;
    if (diff < len)
        // #1
        if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + diff)
            return true;
    for (std::size_t n = 0; n != len; n++)
        if (reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n)
            // #2
            if (reinterpret_cast<char *>(p) - reinterpret_cast<char *>(buf) != diff)
                return true;
    return false;
}

template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
    return points_into_buffer(reinterpret_cast<std::uintptr_t>(p),
                              reinterpret_cast<std::uintptr_t>(buf),
                              len * sizeof(T));
}

#else

template <typename T>
bool points_into_buffer(T *p, T *buf, std::size_t len)
{
    for (std::size_t n = 0; n != len; n++)
        if (p == buf + n)
            return true;
    return false;
}

#endif
In general, diff is not guaranteed to have a meaningful value. But that's okay: the function returns true if and only if it finds some n such that reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + n. It only uses diff as a hint to find the value of n faster.
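The reason a single diff < len test can serve as the hint is unsigned wraparound: if p lies below buf, the subtraction wraps to a very large value that cannot be less than len. The sketch below shows only that arithmetic on plain integers. The name offset_in_range is hypothetical, and it assumes addresses behave like flat integers, which the standard does not promise (hence the pointer re-check in the answer's code).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: one unsigned comparison standing in for
// "buf <= p && p < buf + len" on integer addresses.
// Assumption: flat integer addresses; this is only the hint, not the proof.
inline bool offset_in_range(std::uintptr_t p, std::uintptr_t buf, std::size_t len) {
    // "+ 0u" makes sure the arithmetic is done in an unsigned type at least
    // as wide as unsigned int, so p < buf wraps around instead of going negative.
    const auto diff = p + 0u - buf;
    return diff < len;
}
```

With buf at address 100 and len 4, addresses 100 to 103 are accepted, 104 is rejected because diff equals len, and 99 is rejected because diff wraps to the largest representable value.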
It relies on the compiler optimising conditions that are not necessarily known at compile time in general, but are known at compile time for a particular platform. The conditions for the if statements marked as #1 and #2 are determined by GCC at compile time to always be true and false respectively, because of how diff is defined, allowing GCC to see that no useful action is performed inside the loop, and allowing the entire loop to be dropped.
The generated code for points_into_buffer<char> and points_into_buffer<int> is as follows:
bool points_into_buffer(char*, char*, unsigned int):
        movl    4(%esp), %edx
        movl    $1, %eax
        movl    12(%esp), %ecx
        subl    8(%esp), %edx
        cmpl    %edx, %ecx
        ja      L11
        xorl    %eax, %eax
L11:    rep ret

bool points_into_buffer(int*, int*, unsigned int):
        movl    4(%esp), %edx
        movl    12(%esp), %eax
        subl    8(%esp), %edx
        leal    0(,%eax,4), %ecx
        movl    $1, %eax
        cmpl    %edx, %ecx
        ja      L19
        xorl    %eax, %eax
L19:    rep ret
On systems where std::uintptr_t is not available, or where addresses are more complicated than simple integers, the loop is used instead.
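One way to gain some confidence on a given platform is to compare the constant-time path against the loop, which serves as the reference semantics. The sketch below is a condensed restatement, not the answer's exact code: reference_points_into_buffer, fast_points_into_buffer and fast_matches_reference are hypothetical names, and the fast version assumes std::uintptr_t exists and drops the loop fallback, so on a platform with non-flat addresses it could return false where the answer's function would still search.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical name: the loop, used here as the reference semantics.
template <typename T>
bool reference_points_into_buffer(T *p, T *buf, std::size_t len) {
    for (std::size_t n = 0; n != len; n++)
        if (p == buf + n)
            return true;
    return false;
}

// Hypothetical name: condensed constant-time path (assumes std::uintptr_t
// is available and that the integer difference is a usable byte offset).
template <typename T>
bool fast_points_into_buffer(T *p, T *buf, std::size_t len) {
    const auto diff = reinterpret_cast<std::uintptr_t>(p) + 0u
                    - reinterpret_cast<std::uintptr_t>(buf);
    // Re-check the hint with a real pointer comparison, as condition #1 does.
    return diff < len * sizeof(T)
        && reinterpret_cast<char *>(p) == reinterpret_cast<char *>(buf) + diff;
}

// Hypothetical self-check: every element pointer and the one-past-the-end
// pointer of a small array must get the same answer from both versions.
inline bool fast_matches_reference() {
    int buf[8] = {};
    for (std::size_t i = 0; i <= 8; i++)
        if (fast_points_into_buffer(buf + i, buf, 8)
                != reference_points_into_buffer(buf + i, buf, 8))
            return false;
    return true;
}
```

The check only covers properly aligned element pointers. A pointer into the middle of an element falls under the "behaviour is undefined" case of the specification, so the two versions are allowed to disagree there.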