How can elusive 64-bit portability issues be detected?

Problem description

I found a snippet similar to this in some (C++) code I'm preparing for a 64-bit port.

int n;
size_t pos, npos;

/* ... initialization ... */

while((pos = find(ch, start)) != npos)
{
    /* ... advance start position ... */

    n++; // this will overflow if the loop iterates too many times
}

While I seriously doubt this would actually cause a problem in even memory-intensive applications, it's worth looking at from a theoretical standpoint because similar errors could surface that will cause problems. (Change n to a short in the above example and even small files could overflow the counter.)
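As a rough illustration of the short case (a sketch, not the code being ported), the hypothetical helper `count_char` below runs the same kind of loop with the counter type as a template parameter. It assumes `std::string::find` stands in for the `find` call above, and it uses `std::uint16_t` rather than `short` so that the wraparound is well-defined instead of depending on the implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: counts occurrences of ch in text,
// using Counter as the type of the loop counter.
template <typename Counter>
Counter count_char(const std::string& text, char ch)
{
    Counter n = 0;
    std::size_t start = 0;
    std::size_t pos;

    while ((pos = text.find(ch, start)) != std::string::npos)
    {
        start = pos + 1; // advance the start position past the match
        ++n;             // wraps silently when Counter is too narrow
    }
    return n;
}
```

For a string of 70,000 matching characters, `count_char<std::uint16_t>` returns 4464 (70000 mod 65536), while `count_char<std::size_t>` returns 70000. Nothing in the loop condition hints at the difference.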

Static analysis tools are useful, but they can't detect this kind of error directly. (Not yet, anyway.) The counter n doesn't participate in the while expression at all, so this isn't as simple as other loops (where typecasting errors give the error away). Any tool would need to determine that the loop would execute more than 2^31 times, but that means it needs to be able to estimate how many times the expression (pos = find(ch, start)) != npos will evaluate as true, which is no small feat! Even if a tool could determine that the loop could execute more than 2^31 times (say, because it recognizes the find function is working on a string), how could it know that the loop won't execute more than 2^64 times, overflowing a size_t value, too?

It seems clear that to conclusively identify and fix this kind of error requires a human eye, but are there patterns that give away this kind of error so it can be manually inspected? What similar errors exist that I should be watchful for?

EDIT 1: Since short, int and long types are inherently problematic, this kind of error could be found by examining every instance of those types. However, given their ubiquity in legacy C++ code, I'm not sure this is practical for a large piece of software. What else gives away this error? Is each while loop likely to exhibit some kind of error like this? (for loops certainly aren't immune to it!) How bad is this kind of error if we're not dealing with 16-bit types like short?

EDIT 2: Here's another example, showing how this error appears in a for loop.

int i = 0;
for (iter = c.begin(); iter != c.end(); iter++, i++)
{
    /* ... */
}

It's fundamentally the same problem: loops are counting on some variable that never directly interacts with a wider type. The variable can still overflow, but no compiler or tool detects a casting error. (Strictly speaking, there is none.)

EDIT 3: The code I'm working with is very large. (10-15 million lines of code for C++ alone.) It's infeasible to inspect all of it, so I'm specifically interested in ways to identify this sort of problem (even if it results in a high false-positive rate) automatically.

Solution

Code reviews. Get a bunch of smart people looking at the code.

Use of short, int, or long is a warning sign, because the standard guarantees only a minimum range for each of these types, not an exact width. Most uses should be changed to the int_fastN_t types in <stdint.h>, and uses that deal with serialization should be changed to intN_t. Better still, these <stdint.h> types should be used to typedef new application-specific types.

This example really ought to be:

typedef int_fast32_t linecount_appt;
linecount_appt n;

This expresses a design assumption that linecount fits in 32 bits, and also makes it easy to fix the code if the design requirements change.
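Routing every counter through an application-specific typedef also leaves room for a checking build. The following is a minimal sketch of that idea, not part of the original answer: `checked_counter` and `linecount_checked_appt` are hypothetical names for a wrapper that throws instead of overflowing, which an audit build could use in place of the plain typedef.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical wrapper: an integer counter that throws
// std::overflow_error instead of overflowing.
template <typename T>
class checked_counter
{
public:
    checked_counter() : value_(0) {}

    // Pre-increment; refuses to step past the largest value of T.
    checked_counter& operator++()
    {
        if (value_ == std::numeric_limits<T>::max())
            throw std::overflow_error("counter overflow");
        ++value_;
        return *this;
    }

    T value() const { return value_; }

private:
    T value_;
};

// Assumed usage: an audit build points the application type at the
// checked wrapper, while a release build keeps the plain int_fast32_t.
typedef checked_counter<std::int_fast32_t> linecount_checked_appt;
```

The check costs one comparison per increment, so it is probably better suited to test runs over large inputs than to production code.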
