推荐的方式来跟踪数组中的C程序外绑定访问/写 [英] Recommended way to track down array out-of-bound access/write in C program

查看:169
本文介绍了推荐的方式来跟踪数组中的C程序外绑定访问/写的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

考虑在C.一些不那么明显的算法,例如编写实施让它是递归的快速排序,我在KN国王已经找到C程序设计:一种现代方法,第二版的书,它的可从< A HREF =htt​​p://knking.com/books/c2/programs/qsort.c>此处。最有趣的部分包括以下两个定义:

 无效快速排序(int类型的[],INT低,诠释高)
{
    INT中间;    如果(低&GT; =高)
        返回;    中间=拆分(一,低,高);
    快速排序(A,低,中 - 1);
    快速排序(A,中+ 1,高);
}INT拆分(int类型的[],INT低,诠释高)
{
    INT part_element = A [小]    为(;;){
       而(低&LT;&高功放;&安培; part_element&LT = A(高))
           高 - ;
       如果(低&GT; =高)
           打破;
       一个[低++] = A [高];       而(低&LT;&高功放;&放大器;一个[小]&LT; = part_element)
           低++;
       如果(低&GT; =高)
           打破;
       一个[high--] = A [小]
    }    一个[高] = part_element;
    高回报;
}

两者,而循环可以通过删除优化低&LT;高测试:

 为(;;){
    而(part_element&LT; A(高))
        高 - ;
    如果(低&GT; =高)
        打破;
    一个[低++] = A [高];
    一个[高] = part_element;    而(A [小]&LT; = part_element)
        低++;
    如果(低&GT; =高)
        打破;
    一个[high--] = A [小]
    一个[小] = part_element;
}

什么是使推荐的方式,以确保每一个访问或写入到阵列(分配上的)实际上是有效的(即不是挑起未定义行为)?我已经尝试是:


  • GDB 手动调试的一些实际数据

  • 通源$ C ​​$ C到静态分析工具,如拆分 cppcheck

  • 的valgrind - 工具= EXP-sgcheck 开关

例如有五行阵 {8,1,2,3,4}

 的#define N 5INT主要(无效)
{
    诠释一个[N] = {8,1,2,3,4},I;    快速排序(一,0,N - 1);    的printf(后排序:);
    对于(i = 0; I&LT; N;我++)
        的printf(%d个,一个[我]);
    的putchar('\\ n');    返回0;
}

结果是(肯定它的FPGA实现依赖):

 排序后:1 1 2 4 8

1。 GDB

 (GDB)p低
$ 1 = 3
(GDB)普高
$ 2 = 4
(GDB)P A [小]
$ 3 = 1
(GDB)p part_element
$ 4 = 8
(GDB)■
47低++;
(GDB)■
46时(A [小]&LT; = part_element)
(GDB)■
47低++;
(GDB)■
46时(A [小]&LT; = part_element)
(GDB)p低
$ 5 = 5
(GDB)普高
$ 6 = 4
(GDB)BT全
#0分(A = 0x7fffffffe140,低= 5,高= 4)qsort.c:46
        part_element = 8
在快速排序#1 0x00000000004005df在qsort.c(A = 0x7fffffffe140,低= 0,高= 4):30
        中间= LT;价值优化掉了&GT;
#2 0x0000000000400656在main()在qsort.c:14
        一个= {4,1,2,1,8}
        我= LT;价值优化掉了&GT;

正如你所看到变量跑到外面边界:

 (GDB)p低
$ 5 = 5

2。静态分析工具

  $夹板-retvalint -exportlocal qsort.c
3.1.2夹板07 --- 2011年2月检查完---没有警告$ cppcheck qsort.c
检查qsort.c ...

3。 Valgrind的与 - 工具= EXP-sgcheck

  $的valgrind --tool = EXP-sgcheck ./a.out
== == 5480 EXP-sgcheck,堆栈和全局数组超限检测仪
== == 5480注:这是一个实验级Valgrind的工具
== == 5480版权所有(C)2003-2012,和GNU GPL的,由OpenWorks有限公司等。
== == 5480 Valgrind的使用,3.8.1和LibVEX;与-h版权信息重新运行
== == 5480命令:./a.out
== == 5480
== == 5480大小为4的读取无效
== == 5480在0x4005A0:斯普利特(qsort.c:46)
== == 5480通过0x4005DE:快速排序(qsort.c:30)
== == 5480由0x400655:主(qsort.c:14)
== == 5480地址0x7ff000114预期与实际对比:
== == 5480预计:在框架2尺寸20叠阵是从这里回
== == 5480实际:未知
== == 5480实际:0成功后
== == 5480
1 1 2 4 8:排序后
== == 5480
== == 5480错误摘要:从1上下文1错误(SUP pressed:0 0)

在0x4005A0位置:斯普利特(qsort.c:46)是匹配了同一个地方,因为我发现通过 GDB 手动


解决方案

  

什么是确保每一个访问或写入到阵列(分配堆栈)的推荐方式实际上是有效的(即不是挑起未定义行为)?


如果在Linux上使用通过选项 -fsanitize =地址 - fsanitize =未定义?它也可在 GCC 的http:// GCC。 gnu.org/gcc-4.8/changes.html



通过选项 -fsanitize =未定义

这是一个例子:

 的#include&LT;&stdlib.h中GT;#定义N 5INT主(INT ARGC,CHAR *的argv [])
{
  诠释一个[N] = {8,1,2,3,4},I;  INT R = 0;
  INT结束=的atoi(ARGV [1]);
  的for(int i = 0;!I =结束; ++ I)
    R + = A [];  返回ř;
}

然后

铛-fno-省略帧指针-fsanitize =未定义-g out_boundary.c -o out_boundary_clang

  $ ./out_boundary_clang 5
$ ./out_boundary_clang 6
out_boundary.c:12:10:运行时错误:指数5出界类型的'INT [5]
非法指令(核心转储)

和再分析一个核心文件

 程序与信号4,非法指令终止。
#0主(ARGC = 2,argv的= 0x7fff3a1c28c8)在out_boundary.c:12
12 R + = A [];
(GDB)P I
$ 1 = 5



通过选项 -fsanitize =地址

这是一个报价:

 该工具可以检测到以下类型的错误:*超出边界访问堆,栈和全局
*释放后使用,
*释放后使用收益率(一定程度上)
*双免费,免费无效
*内存泄漏(实验)

铛-fno-省略帧指针-fsanitize =地址-g out_boundary.c -o out_boundary_clang

和则:

  $ ./out_boundary_clang 6 2  - ;&放大器; 1 | asan_symbolize.py
================================================== ===============
== == 9634 ERROR:AddressSanitizer:堆栈缓冲区溢出在PC上的地址0x7fff91bb2ad4 0x459c67 BP 0x7fff91bb2910 SP 0x7fff91bb2908
在0x7fff91bb2ad4线程T0读取大小4
    在主out_boundary.c#0 0x459c66:12
    #1 0x3a1d81ed1c在__libc_start_main ??:0
    #2 0x4594ac在_start ??:0
地址0x7fff91bb2ad4位于线程T0的堆栈帧偏移244
    在主out_boundary.c#0 0x45957f:6
  这个框架有8个目标(S):
    [32,36)'
    [96,100)'
    [160,168)''
    [224,244)'A'
    [288,292)'我'
    [352,356)'R'
    [416,420)结束
    [480,484)I1
提示:这可能是一个误报,如果你的程序使用了自定义堆栈展开的机制或swapcontext
      (longjmp的和C ++异常* *都支持)
暗影字节各地的车地址:
  0x10007236e500:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e510:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e520:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e530:00 00 00 00 00 00 00 00 00 00 00 00 F1 F1 F1 F1
  0x10007236e540:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F2 F2 F2 F2
= GT; 0x10007236e550:00 F4 F4 F4 F2 F2 F2 F2 00 00 [04] F4 F2 F2 F2 F2
  0x10007236e560:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F2 F2 F2 F2
  0x10007236e570:04 F4 F4 F4 F2 F2 F2 F2 04 F4 F4 F4 F3 F3 F3 F3
  0x10007236e580:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e590:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e5a0:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
影传说字节(一个字节的阴影重新presents 8应用字节):
  寻址:00
  部分寻址:01 02 03 04 05 06 07
  堆左redzone后面:发
  堆右redzone后面:FB
  释放堆区:FD
  堆栈留下redzone后面:F1
  堆栈中旬redzone后面:F2
  堆栈右redzone后面:F3
  堆栈部分redzone后面:F4
  返回后堆栈:F5
  F8:在范围堆栈使用
  全球redzone后面:F9
  全局初始化命令:F6
  用户中毒:F7
  阿三内部:FE
== == 9634中止

或者,您可以同时使用此选项。相关链接:

Consider writing implementation for some not-so-obvious algorithm in C. For example let it be recursive quicksort, that I have found in K. N. King's "C Programming: A Modern Approach, 2nd Edition" book, that it's available from here. The most interesting part consist of two following definitions:

void quicksort(int a[], int low, int high)
{
    int middle;

    if (low >= high)
        return;

    middle = split(a, low, high);
    quicksort(a, low, middle - 1);
    quicksort(a, middle + 1, high);
}

int split(int a[], int low, int high)
{
    int part_element = a[low];

    for (;;) {
       while (low < high && part_element <= a[high])
           high--;
       if (low >= high)
           break;
       a[low++] = a[high];

       while (low < high && a[low] <= part_element)
           low++;
       if (low >= high)
           break;
       a[high--] = a[low];
    }

    a[high] = part_element;
    return high;
}

Both while loops can be optimized by removing low < high tests:

for (;;) {
    while (part_element < a[high])
        high--;
    if (low >= high)
        break;
    a[low++] = a[high];
    a[high] = part_element;

    while (a[low] <= part_element)
        low++;
    if (low >= high)
        break;
    a[high--] = a[low];
    a[low] = part_element;
}

What is the recommended way to make sure that every access or write to array (allocated on stack) is actually valid (i.e. not provoking undefined behaviour) ? What I already tried is to:

  • manually debug with gdb on some actual data
  • pass source code to static analysis tools like split or cppcheck
  • valgrind with --tool=exp-sgcheck switch

For example having five elements array {8, 1, 2, 3, 4}:

#define N 5

int main(void)
{
    int a[N] = {8, 1, 2, 3, 4}, i;

    quicksort(a, 0, N - 1);

    printf("After sort:");
    for (i = 0; i < N; i++)
        printf(" %d", a[i]);
    putchar('\n');

    return 0;
}

Result is (most certainly it's implemention dependent):

After sort: 1 1 2 4 8

1. GDB

(gdb) p low
$1 = 3
(gdb) p high
$2 = 4
(gdb) p a[low]
$3 = 1
(gdb) p part_element
$4 = 8
(gdb) s
47              low++;
(gdb) s
46          while (a[low] <= part_element)
(gdb) s
47              low++;
(gdb) s
46          while (a[low] <= part_element)
(gdb) p low
$5 = 5
(gdb) p high
$6 = 4
(gdb) bt full
#0  split (a=0x7fffffffe140, low=5, high=4) at qsort.c:46
        part_element = 8
#1  0x00000000004005df in quicksort (a=0x7fffffffe140, low=0, high=4) at qsort.c:30
        middle = <value optimized out>
#2  0x0000000000400656 in main () at qsort.c:14
        a = {4, 1, 2, 1, 8}
        i = <value optimized out>

As you see low variable went outside boundary:

(gdb) p low
$5 = 5

2. Static analysis tools

$ splint -retvalint -exportlocal qsort.c 
Splint 3.1.2 --- 07 Feb 2011

Finished checking --- no warnings

$ cppcheck qsort.c 
Checking qsort.c...

3. Valgrind with --tool=exp-sgcheck

$ valgrind --tool=exp-sgcheck ./a.out 
==5480== exp-sgcheck, a stack and global array overrun detector
==5480== NOTE: This is an Experimental-Class Valgrind Tool
==5480== Copyright (C) 2003-2012, and GNU GPL'd, by OpenWorks Ltd et al.
==5480== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==5480== Command: ./a.out
==5480== 
==5480== Invalid read of size 4
==5480==    at 0x4005A0: split (qsort.c:46)
==5480==    by 0x4005DE: quicksort (qsort.c:30)
==5480==    by 0x400655: main (qsort.c:14)
==5480==  Address 0x7ff000114 expected vs actual:
==5480==  Expected: stack array "a" of size 20 in frame 2 back from here
==5480==  Actual:   unknown
==5480==  Actual:   is 0 after Expected
==5480== 
After sort: 1 1 2 4 8
==5480== 
==5480== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The location at 0x4005A0: split (qsort.c:46) is matching to the same place as I found by gdb manually.

解决方案

What is the recommended way to make sure that every access or write to array (allocated on stack) is actually valid (i.e. not provoking undefined behaviour) ?

What if use clang on Linux with the options -fsanitize=addressand -fsanitize=undefined? It is also available in gcc: http://gcc.gnu.org/gcc-4.8/changes.html.


clang with the option -fsanitize=undefined

This is an example:

#include <stdlib.h>

#define N 5

int main(int argc, char *argv[])
{
  int a[N] = {8, 1, 2, 3, 4}, i;

  int r =0;
  int end = atoi(argv[1]);
  for (int i = 0; i != end; ++i)
    r += a[i];

  return r;
}

Then

clang -fno-omit-frame-pointer -fsanitize=undefined -g out_boundary.c -o out_boundary_clang

$ ./out_boundary_clang 5
$ ./out_boundary_clang 6
out_boundary.c:12:10: runtime error: index 5 out of bounds for type 'int [5]'
Illegal instruction (core dumped)

And then analyze a core file

Program terminated with signal 4, Illegal instruction.
#0  main (argc=2, argv=0x7fff3a1c28c8) at out_boundary.c:12
12          r += a[i];
(gdb) p i
$1 = 5


clang with the option -fsanitize=address

This is a quote:

The tool can detect the following types of bugs:

* Out-of-bounds accesses to heap, stack and globals
* Use-after-free
* Use-after-return (to some extent)
* Double-free, invalid free
* Memory leaks (experimental)

clang -fno-omit-frame-pointer -fsanitize=address -g out_boundary.c -o out_boundary_clang

And then:

$ ./out_boundary_clang 6 2>&1 | asan_symbolize.py
=================================================================
==9634==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff91bb2ad4 at pc 0x459c67 bp 0x7fff91bb2910 sp 0x7fff91bb2908
READ of size 4 at 0x7fff91bb2ad4 thread T0
    #0 0x459c66 in main out_boundary.c:12
    #1 0x3a1d81ed1c in __libc_start_main ??:0
    #2 0x4594ac in _start ??:0
Address 0x7fff91bb2ad4 is located in stack of thread T0 at offset 244 in frame
    #0 0x45957f in main out_boundary.c:6
  This frame has 8 object(s):
    [32, 36) ''
    [96, 100) ''
    [160, 168) ''
    [224, 244) 'a'
    [288, 292) 'i'
    [352, 356) 'r'
    [416, 420) 'end'
    [480, 484) 'i1'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Shadow bytes around the buggy address:
  0x10007236e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e530: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10007236e540: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2
=>0x10007236e550: 00 f4 f4 f4 f2 f2 f2 f2 00 00[04]f4 f2 f2 f2 f2
  0x10007236e560: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2
  0x10007236e570: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f3 f3 f3 f3
  0x10007236e580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==9634==ABORTING

Or you can use both this options. Useful links:

这篇关于推荐的方式来跟踪数组中的C程序外绑定访问/写的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆