推荐的方式来跟踪数组中的C程序外绑定访问/写 [英] Recommended way to track down array out-of-bound access/write in C program
问题描述
考虑在C.一些不那么明显的算法,例如编写实施让它是递归的快速排序,我在KN国王已经找到C程序设计:一种现代方法,第二版的书,它的可从< A HREF =http://knking.com/books/c2/programs/qsort.c>此处。最有趣的部分包括以下两个定义:
无效快速排序(int类型的[],INT低,诠释高)
{
INT中间; 如果(低&GT; =高)
返回; 中间=拆分(一,低,高);
快速排序(A,低,中 - 1);
快速排序(A,中+ 1,高);
}INT拆分(int类型的[],INT低,诠释高)
{
INT part_element = A [小] 为(;;){
而(低&LT;&高功放;&安培; part_element&LT = A(高))
高 - ;
如果(低&GT; =高)
打破;
一个[低++] = A [高]; 而(低&LT;&高功放;&放大器;一个[小]&LT; = part_element)
低++;
如果(低&GT; =高)
打破;
一个[high--] = A [小]
} 一个[高] = part_element;
高回报;
}
两者,而
循环可以通过删除优化低&LT;高
测试:
为(;;){
而(part_element&LT; A(高))
高 - ;
如果(低&GT; =高)
打破;
一个[低++] = A [高];
一个[高] = part_element; 而(A [小]&LT; = part_element)
低++;
如果(低&GT; =高)
打破;
一个[high--] = A [小]
一个[小] = part_element;
}
什么是使推荐的方式,以确保每一个访问或写入到阵列(分配上的栈)实际上是有效的(即不是挑起未定义行为)?我已经尝试是:
- 与
GDB
手动调试的一些实际数据 - 通源$ C $ C到静态分析工具,如
拆分
或cppcheck
-
的valgrind
与- 工具= EXP-sgcheck
开关
例如有五行阵 {8,1,2,3,4}
:
的#define N 5INT主要(无效)
{
诠释一个[N] = {8,1,2,3,4},I; 快速排序(一,0,N - 1); 的printf(后排序:);
对于(i = 0; I&LT; N;我++)
的printf(%d个,一个[我]);
的putchar('\\ n'); 返回0;
}
结果是(肯定它的FPGA实现依赖):
排序后:1 1 2 4 8
1。 GDB
(GDB)p低
$ 1 = 3
(GDB)普高
$ 2 = 4
(GDB)P A [小]
$ 3 = 1
(GDB)p part_element
$ 4 = 8
(GDB)■
47低++;
(GDB)■
46时(A [小]&LT; = part_element)
(GDB)■
47低++;
(GDB)■
46时(A [小]&LT; = part_element)
(GDB)p低
$ 5 = 5
(GDB)普高
$ 6 = 4
(GDB)BT全
#0分(A = 0x7fffffffe140,低= 5,高= 4)qsort.c:46
part_element = 8
在快速排序#1 0x00000000004005df在qsort.c(A = 0x7fffffffe140,低= 0,高= 4):30
中间= LT;价值优化掉了&GT;
#2 0x0000000000400656在main()在qsort.c:14
一个= {4,1,2,1,8}
我= LT;价值优化掉了&GT;
正如你所看到低
变量跑到外面边界:
(GDB)p低
$ 5 = 5
2。静态分析工具
$夹板-retvalint -exportlocal qsort.c
3.1.2夹板07 --- 2011年2月检查完---没有警告$ cppcheck qsort.c
检查qsort.c ...
3。 Valgrind的与 - 工具= EXP-sgcheck
$的valgrind --tool = EXP-sgcheck ./a.out
== == 5480 EXP-sgcheck,堆栈和全局数组超限检测仪
== == 5480注:这是一个实验级Valgrind的工具
== == 5480版权所有(C)2003-2012,和GNU GPL的,由OpenWorks有限公司等。
== == 5480 Valgrind的使用,3.8.1和LibVEX;与-h版权信息重新运行
== == 5480命令:./a.out
== == 5480
== == 5480大小为4的读取无效
== == 5480在0x4005A0:斯普利特(qsort.c:46)
== == 5480通过0x4005DE:快速排序(qsort.c:30)
== == 5480由0x400655:主(qsort.c:14)
== == 5480地址0x7ff000114预期与实际对比:
== == 5480预计:在框架2尺寸20叠阵是从这里回
== == 5480实际:未知
== == 5480实际:0成功后
== == 5480
1 1 2 4 8:排序后
== == 5480
== == 5480错误摘要:从1上下文1错误(SUP pressed:0 0)
的在0x4005A0位置:斯普利特(qsort.c:46)
是匹配了同一个地方,因为我发现通过 GDB
手动
什么是确保每一个访问或写入到阵列(分配堆栈)的推荐方式实际上是有效的(即不是挑起未定义行为)?
块引用>如果在Linux上使用
铛
通过选项-fsanitize =地址
和- fsanitize =未定义
?它也可在GCC
:的http:// GCC。 gnu.org/gcc-4.8/changes.html 。
铛
通过选项-fsanitize =未定义
这是一个例子:
的#include&LT;&stdlib.h中GT;#定义N 5INT主(INT ARGC,CHAR *的argv [])
{
诠释一个[N] = {8,1,2,3,4},I; INT R = 0;
INT结束=的atoi(ARGV [1]);
的for(int i = 0;!I =结束; ++ I)
R + = A []; 返回ř;
}然后
铛-fno-省略帧指针-fsanitize =未定义-g out_boundary.c -o out_boundary_clang
$ ./out_boundary_clang 5
$ ./out_boundary_clang 6
out_boundary.c:12:10:运行时错误:指数5出界类型的'INT [5]
非法指令(核心转储)和再分析一个核心文件
程序与信号4,非法指令终止。
#0主(ARGC = 2,argv的= 0x7fff3a1c28c8)在out_boundary.c:12
12 R + = A [];
(GDB)P I
$ 1 = 5
铛
通过选项-fsanitize =地址
这是一个报价:
该工具可以检测到以下类型的错误:*超出边界访问堆,栈和全局
*释放后使用,
*释放后使用收益率(一定程度上)
*双免费,免费无效
*内存泄漏(实验)
铛-fno-省略帧指针-fsanitize =地址-g out_boundary.c -o out_boundary_clang
和则:
$ ./out_boundary_clang 6 2 - ;&放大器; 1 | asan_symbolize.py
================================================== ===============
== == 9634 ERROR:AddressSanitizer:堆栈缓冲区溢出在PC上的地址0x7fff91bb2ad4 0x459c67 BP 0x7fff91bb2910 SP 0x7fff91bb2908
在0x7fff91bb2ad4线程T0读取大小4
在主out_boundary.c#0 0x459c66:12
#1 0x3a1d81ed1c在__libc_start_main ??:0
#2 0x4594ac在_start ??:0
地址0x7fff91bb2ad4位于线程T0的堆栈帧偏移244
在主out_boundary.c#0 0x45957f:6
这个框架有8个目标(S):
[32,36)'
[96,100)'
[160,168)''
[224,244)'A'
[288,292)'我'
[352,356)'R'
[416,420)结束
[480,484)I1
提示:这可能是一个误报,如果你的程序使用了自定义堆栈展开的机制或swapcontext
(longjmp的和C ++异常* *都支持)
暗影字节各地的车地址:
0x10007236e500:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007236e510:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007236e520:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007236e530:00 00 00 00 00 00 00 00 00 00 00 00 F1 F1 F1 F1
0x10007236e540:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F2 F2 F2 F2
= GT; 0x10007236e550:00 F4 F4 F4 F2 F2 F2 F2 00 00 [04] F4 F2 F2 F2 F2
0x10007236e560:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F2 F2 F2 F2
0x10007236e570:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F3 F3 F3 F3
0x10007236e580:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007236e590:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x10007236e5a0:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
影传说字节(一个字节的阴影重新presents 8应用字节):
寻址:00
部分寻址:01 02 03 04 05 06 07
堆左redzone后面:发
堆右redzone后面:FB
释放堆区:FD
堆栈留下redzone后面:F1
堆栈中旬redzone后面:F2
堆栈右redzone后面:F3
堆栈部分redzone后面:F4
返回后堆栈:F5
F8:在范围堆栈使用
全球redzone后面:F9
全局初始化命令:F6
用户中毒:F7
阿三内部:FE
== == 9634中止或者,您可以同时使用此选项。相关链接:
- http://clang.llvm.org/docs/UsersManual.html#controlling-$c$c-generation
- http://developer.mozilla.org/en-US/docs/Mozilla/Testing/Firefox_and_Address_Sanitizer
- http://clang.llvm.org/docs/AddressSanitizer.html
Consider writing implementation for some not-so-obvious algorithm in C. For example let it be recursive quicksort, that I have found in K. N. King's "C Programming: A Modern Approach, 2nd Edition" book, that it's available from here. The most interesting part consist of two following definitions:
void quicksort(int a[], int low, int high) { int middle; if (low >= high) return; middle = split(a, low, high); quicksort(a, low, middle - 1); quicksort(a, middle + 1, high); } int split(int a[], int low, int high) { int part_element = a[low]; for (;;) { while (low < high && part_element <= a[high]) high--; if (low >= high) break; a[low++] = a[high]; while (low < high && a[low] <= part_element) low++; if (low >= high) break; a[high--] = a[low]; } a[high] = part_element; return high; }
Both
while
loops can be optimized by removinglow < high
tests:for (;;) { while (part_element < a[high]) high--; if (low >= high) break; a[low++] = a[high]; a[high] = part_element; while (a[low] <= part_element) low++; if (low >= high) break; a[high--] = a[low]; a[low] = part_element; }
What is the recommended way to make sure that every access or write to array (allocated on stack) is actually valid (i.e. not provoking undefined behaviour) ? What I already tried is to:
- manually debug with
gdb
on some actual data- pass source code to static analysis tools like
split
orcppcheck
valgrind
with--tool=exp-sgcheck
switchFor example having five elements array
{8, 1, 2, 3, 4}
:#define N 5 int main(void) { int a[N] = {8, 1, 2, 3, 4}, i; quicksort(a, 0, N - 1); printf("After sort:"); for (i = 0; i < N; i++) printf(" %d", a[i]); putchar('\n'); return 0; }
Result is (most certainly it's implemention dependent):
After sort: 1 1 2 4 8
1. GDB
(gdb) p low $1 = 3 (gdb) p high $2 = 4 (gdb) p a[low] $3 = 1 (gdb) p part_element $4 = 8 (gdb) s 47 low++; (gdb) s 46 while (a[low] <= part_element) (gdb) s 47 low++; (gdb) s 46 while (a[low] <= part_element) (gdb) p low $5 = 5 (gdb) p high $6 = 4 (gdb) bt full #0 split (a=0x7fffffffe140, low=5, high=4) at qsort.c:46 part_element = 8 #1 0x00000000004005df in quicksort (a=0x7fffffffe140, low=0, high=4) at qsort.c:30 middle = <value optimized out> #2 0x0000000000400656 in main () at qsort.c:14 a = {4, 1, 2, 1, 8} i = <value optimized out>
As you see
low
variable went outside boundary:(gdb) p low $5 = 5
2. Static analysis tools
$ splint -retvalint -exportlocal qsort.c Splint 3.1.2 --- 07 Feb 2011 Finished checking --- no warnings $ cppcheck qsort.c Checking qsort.c...
3. Valgrind with
--tool=exp-sgcheck
$ valgrind --tool=exp-sgcheck ./a.out ==5480== exp-sgcheck, a stack and global array overrun detector ==5480== NOTE: This is an Experimental-Class Valgrind Tool ==5480== Copyright (C) 2003-2012, and GNU GPL'd, by OpenWorks Ltd et al. ==5480== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info ==5480== Command: ./a.out ==5480== ==5480== Invalid read of size 4 ==5480== at 0x4005A0: split (qsort.c:46) ==5480== by 0x4005DE: quicksort (qsort.c:30) ==5480== by 0x400655: main (qsort.c:14) ==5480== Address 0x7ff000114 expected vs actual: ==5480== Expected: stack array "a" of size 20 in frame 2 back from here ==5480== Actual: unknown ==5480== Actual: is 0 after Expected ==5480== After sort: 1 1 2 4 8 ==5480== ==5480== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
The location
at 0x4005A0: split (qsort.c:46)
is matching to the same place as I found bygdb
manually.解决方案What is the recommended way to make sure that every access or write to array (allocated on stack) is actually valid (i.e. not provoking undefined behaviour) ?
What if use
clang
on Linux with the options-fsanitize=address
and-fsanitize=undefined
? It is also available ingcc
: http://gcc.gnu.org/gcc-4.8/changes.html.
clang
with the option-fsanitize=undefined
This is an example:
#include <stdlib.h> #define N 5 int main(int argc, char *argv[]) { int a[N] = {8, 1, 2, 3, 4}, i; int r =0; int end = atoi(argv[1]); for (int i = 0; i != end; ++i) r += a[i]; return r; }
Then
clang -fno-omit-frame-pointer -fsanitize=undefined -g out_boundary.c -o out_boundary_clang
$ ./out_boundary_clang 5 $ ./out_boundary_clang 6 out_boundary.c:12:10: runtime error: index 5 out of bounds for type 'int [5]' Illegal instruction (core dumped)
And then analyze a core file
Program terminated with signal 4, Illegal instruction. #0 main (argc=2, argv=0x7fff3a1c28c8) at out_boundary.c:12 12 r += a[i]; (gdb) p i $1 = 5
clang
with the option-fsanitize=address
This is a quote:
The tool can detect the following types of bugs: * Out-of-bounds accesses to heap, stack and globals * Use-after-free * Use-after-return (to some extent) * Double-free, invalid free * Memory leaks (experimental)
clang -fno-omit-frame-pointer -fsanitize=address -g out_boundary.c -o out_boundary_clang
And then:
$ ./out_boundary_clang 6 2>&1 | asan_symbolize.py ================================================================= ==9634==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff91bb2ad4 at pc 0x459c67 bp 0x7fff91bb2910 sp 0x7fff91bb2908 READ of size 4 at 0x7fff91bb2ad4 thread T0 #0 0x459c66 in main out_boundary.c:12 #1 0x3a1d81ed1c in __libc_start_main ??:0 #2 0x4594ac in _start ??:0 Address 0x7fff91bb2ad4 is located in stack of thread T0 at offset 244 in frame #0 0x45957f in main out_boundary.c:6 This frame has 8 object(s): [32, 36) '' [96, 100) '' [160, 168) '' [224, 244) 'a' [288, 292) 'i' [352, 356) 'r' [416, 420) 'end' [480, 484) 'i1' HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) Shadow bytes around the buggy address: 0x10007236e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10007236e510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10007236e520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10007236e530: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 0x10007236e540: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 =>0x10007236e550: 00 f4 f4 f4 f2 f2 f2 f2 00 00[04]f4 f2 f2 f2 f2 0x10007236e560: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2 0x10007236e570: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f3 f3 f3 f3 0x10007236e580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10007236e590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x10007236e5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 ASan internal: fe ==9634==ABORTING
Or you can use both this options. Useful links:
- http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
- http://developer.mozilla.org/en-US/docs/Mozilla/Testing/Firefox_and_Address_Sanitizer
- http://clang.llvm.org/docs/AddressSanitizer.html
这篇关于推荐的方式来跟踪数组中的C程序外绑定访问/写的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!