How to disable possible stack smashing protection (EIP is not being overwritten, EBP is)
Problem description
I'm trying to figure out how stack smashing is carried out step by step. I have already used Google to no avail; I still don't know why my EIP is not being overwritten. I have this example program:
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char *argv[])
    {
        char buf[10];

        strcpy(buf, argv[1]);
        printf("Done.\n");
        return 0;

    }
It's compiled with
gcc -g -o prog main.c
When I pass a lot of A's as the argument I get a SEGV, and the register EBP (and also the argc and argv addresses) are overwritten:
    Program received signal SIGSEGV, Segmentation fault.
    0x08048472 in main (argc=<error reading variable: Cannot access memory at address 0x41414141>, argv=<error reading variable: Cannot access memory at address 0x41414145>) at main.c:12
    12      }
    (gdb) info reg
    eax            0x0          0
    ecx            0x41414141   1094795585
    edx            0xb7fbb878   -1208240008
    ebx            0xb7fba000   -1208246272
    esp            0x4141413d   0x4141413d
    ebp            0x41414141   0x41414141
    esi            0x0          0
    edi            0x0          0
    eip            0x8048472    0x8048472 <main+71>
    eflags         0x10282      [ SF IF RF ]
    cs             0x73         115
    ss             0x7b         123
    ds             0x7b         123
    es             0x7b         123
    fs             0x0          0
    gs             0x33         51
I thought that the saved EIP is just below EBP, but EIP still holds an address from the main function. Here's the disassembly of main:
    (gdb) disass main
    Dump of assembler code for function main:
       0x0804842b <+0>:  lea    0x4(%esp),%ecx
       0x0804842f <+4>:  and    $0xfffffff0,%esp
       0x08048432 <+7>:  pushl  -0x4(%ecx)
       0x08048435 <+10>: push   %ebp
       0x08048436 <+11>: mov    %esp,%ebp
       0x08048438 <+13>: push   %ecx
       0x08048439 <+14>: sub    $0x14,%esp
       0x0804843c <+17>: mov    %ecx,%eax
       0x0804843e <+19>: mov    0x4(%eax),%eax
       0x08048441 <+22>: add    $0x4,%eax
       0x08048444 <+25>: mov    (%eax),%eax
       0x08048446 <+27>: sub    $0x8,%esp
       0x08048449 <+30>: push   %eax
       0x0804844a <+31>: lea    -0x12(%ebp),%eax
       0x0804844d <+34>: push   %eax
       0x0804844e <+35>: call   0x80482f0 <strcpy@plt>
       0x08048453 <+40>: add    $0x10,%esp
       0x08048456 <+43>: sub    $0xc,%esp
       0x08048459 <+46>: push   $0x8048510
       0x0804845e <+51>: call   0x8048300 <puts@plt>
       0x08048463 <+56>: add    $0x10,%esp
       0x08048466 <+59>: mov    $0x0,%eax
       0x0804846b <+64>: mov    -0x4(%ebp),%ecx
       0x0804846e <+67>: leave
       0x0804846f <+68>: lea    -0x4(%ecx),%esp
    => 0x08048472 <+71>: ret
    End of assembler dump.
Now I'm in the process of figuring out the assembler instructions one by one, but I don't see the moment where EIP is loaded with a return address from the stack just after strcpy finishes. I tried -fno-stack-protector but it didn't change a thing. What could be the reason for this?
EDIT:
OK, I'll try to go over it step by step; please correct me where I'm wrong:
    # Just below the sp are argc and argv, and the sp points to the address
    # where RET will be stored
    # This one moves the address of argc (which is on the stack) to $ecx
    0x0804842b <+0>:  lea    0x4(%esp),%ecx
    # Move stack pointer down for alignment
    0x0804842f <+4>:  and    $0xfffffff0,%esp
    # Push the value to which $sp pointed before alignment
    # It is never used - correct me if I'm wrong
    0x08048432 <+7>:  pushl  -0x4(%ecx)
    # Push last used base pointer value (and start creating another frame)
    0x08048435 <+10>: push   %ebp
    # Set current position sp as bp - I think here the main body starts
    0x08048436 <+11>: mov    %esp,%ebp
    # Push the address of argc - it's later used for calculating
    # the address of argv[1].
    0x08048438 <+13>: push   %ecx
    # Make some space on the stack (20 bytes - 5 words - the first two I'm
    # not sure what for (alignment and a return value not used here?),
    # another 3 for buffer[10])
    0x08048439 <+14>: sub    $0x14,%esp
    # Move argc address to $eax
    0x0804843c <+17>: mov    %ecx,%eax
    # Move argv address to $eax
    0x0804843e <+19>: mov    0x4(%eax),%eax
    # Move past argv - $eax should now point to pointer to first
    # argument string
    0x08048441 <+22>: add    $0x4,%eax
    # Move the address of the parameter string to $eax
    0x08048444 <+25>: mov    (%eax),%eax
    # Make space for 2 words
    # (probably alignment and return value from strcpy)
    0x08048446 <+27>: sub    $0x8,%esp
    # Push the parameter address
    0x08048449 <+30>: push   %eax
    # Get the address of the local buffer
    0x0804844a <+31>: lea    -0x12(%ebp),%eax
    # Push it
    0x0804844d <+34>: push   %eax
    # Call strcpy
    0x0804844e <+35>: call   0x80482f0 <strcpy@plt>
    # Remove 4 words - 2 for arguments and 2 for return + alignment
    0x08048453 <+40>: add    $0x10,%esp
    # Make space for 3 words - alignment + return value
    0x08048456 <+43>: sub    $0xc,%esp
    # Push the printf argument address (the string address)
    0x08048459 <+46>: push   $0x8048510
    # Call printf
    0x0804845e <+51>: call   0x8048300 <puts@plt>
    # Remove 4 words - 1 for parameter and previous 3
    0x08048463 <+56>: add    $0x10,%esp
    # Reset 0x0 just because
    0x08048466 <+59>: mov    $0x0,%eax
    # Load previously saved address of argc
    0x0804846b <+64>: mov    -0x4(%ebp),%ecx
    # not sure about that leave...
    0x0804846e <+67>: leave
    # Reload $esp starting value
    0x0804846f <+68>: lea    -0x4(%ecx),%esp
    # Pop the RET address - this one should be changed to
    # pointer to malicious code
    => 0x08048472 <+71>: ret
- Is the value in line +7 unnecessary? I don't see any use for it, so why is it stored?
- In some places sp moves more than it has to - is it due to alignment? (e.g. line +14)
- Is my conclusion over line +71 correct?
Solution
Disclaimer: I am using gcc-4.8.3 on a Windows 7 system with gnuwin32 installed. Windows doesn't appear to have ASLR enabled by default, so I get reproducible memory addresses when I run this program, which makes life a bit easier. Also, if you follow along, the memory addresses you get will, in all probability, be different.
Now consider this program:
    #include <string.h>

    void copyinput(char* input)
    {
        char buf[10];
        strcpy(buf, input);
    }

    int main(int argc, char** argv)
    {
        int a = 5;
        copyinput(argv[1]);
        a = 7;
        return 0;
    }
which we can compile with this command line:
gcc -g -ansi -pedantic -Wall overflow2.c -o overflow
and then run the program under gdb.
We place a breakpoint at `main`, set the command line argument to "AAAAAAAAAABBBBBBBBBBCCCCCCCCCC", and note the following:
First, note the disassembly of main:
       0x0040157a <+0>:  push   %ebp
       0x0040157b <+1>:  mov    %esp,%ebp
    => 0x0040157d <+3>:  and    $0xfffffff0,%esp
       0x00401580 <+6>:  sub    $0x20,%esp
       0x00401583 <+9>:  call   0x401fd0 <__main>
       0x00401588 <+14>: movl   $0x5,0x1c(%esp)
       0x00401590 <+22>: mov    0xc(%ebp),%eax
       0x00401593 <+25>: add    $0x4,%eax
       0x00401596 <+28>: mov    (%eax),%eax
       0x00401598 <+30>: mov    %eax,(%esp)
       0x0040159b <+33>: call   0x401560 <copyinput>
       0x004015a0 <+38>: movl   $0x7,0x1c(%esp)
       0x004015a8 <+46>: mov    $0x0,%eax
       0x004015ad <+51>: leave
       0x004015ae <+52>: ret
       0x004015af <+53>: nop
What we are interested in here is the address of the next instruction after we call copyinput. This will be the value of eip that gets pushed on the stack when control flow is passed to copyinput.
Let's look at the registers:
    (gdb) info reg
    eax            0x1          1
    ecx            0x752c1162   1965822306
    edx            0xa02080     10494080
    ebx            0x2          2
    esp            0x28fea0     0x28fea0
    ebp            0x28fec8     0x28fec8
    esi            0xa01858     10491992
    edi            0x1f         31
    eip            0x401590     0x401590 <main+22>
    eflags         0x202        [ IF ]
    cs             0x23         35
    ss             0x2b         43
    ds             0x2b         43
    es             0x2b         43
    fs             0x53         83
    gs             0x2b         43
We are interested in esp and ebp from the above. Remember that ebp should also get pushed onto the stack during the function call to copyinput.
Single-step to the invocation of copyinput and then step into that function. At this point, look at the registers (before the call to strcpy) again:
    (gdb) info reg
    eax            0x9218b0     9574576
    ecx            0x752c1162   1965822306
    edx            0x922080     9576576
    ebx            0x2          2
    esp            0x28fe70     0x28fe70
    ebp            0x28fe98     0x28fe98
    esi            0x921858     9574488
    edi            0x1f         31
    eip            0x401566     0x401566 <copyinput+6>
    eflags         0x202        [ IF ]
    cs             0x23         35
    ss             0x2b         43
    ds             0x2b         43
    es             0x2b         43
    fs             0x53         83
    gs             0x2b         43
What we can see here is that the stack frame for copyinput runs from 0x28fe70 to 0x28fe98, and referring back to the register dump taken in main we can see that the stack frame for main is based at 0x28fec8.
We can examine the stack from 0x28fe70 to 0x28fec8 (a total of 88 bytes) like this:
    (gdb) x/88xb 0x28fe70
    0x28fe70: 0x50 0x15 0x40 0x00 0xdc 0x00 0x00 0x00
    0x28fe78: 0xff 0xff 0xff 0xff 0x30 0x60 0x44 0x00
    0x28fe80: 0x03 0x00 0x00 0x00 0x8c 0xfe 0x28 0x00
    0x28fe88: 0x00 0x00 0x00 0x00 0x8f 0x17 0x40 0x00
    0x28fe90: 0x50 0x1f 0x40 0x00 0x1c 0x50 0x40 0x00
    0x28fe98: 0xc8 0xfe 0x28 0x00 0xa0 0x15 0x40 0x00
    0x28fea0: 0xb0 0x18 0x92 0x00 0x00 0x50 0x40 0x00
    0x28fea8: 0x88 0xff 0x28 0x00 0xae 0x1f 0x40 0x00
    0x28feb0: 0x50 0x1f 0x40 0x00 0x60 0x00 0x00 0x40
    0x28feb8: 0x1f 0x00 0x00 0x00 0x05 0x00 0x00 0x00
    0x28fec0: 0x58 0x17 0x92 0x00 0x1f 0x00 0x00 0x00
The raw memory dump is not very easy to read. x86 stores each word least-significant byte first (little-endian), so let's collapse the bytes into 32-bit words written in the usual most-significant-digit-first notation; we can then see where certain values are located:
    0x28fe70: 0x00401550   <- esp for `copyinput`
              0x000000dc
    0x28fe78: 0xffffffff
              0x00446030
    0x28fe80: 0x00000003
              0x0028fe8c
    0x28fe88: 0x00000000
              0x0040178f
    0x28fe90: 0x00401f50
              0x0040501c
    0x28fe98: 0x0028fec8   <- stored ebp for `main`'s stack frame
              0x004015a0   <- stored eip
    0x28fea0: 0x009218b0   <- esp for `main`'s stack frame
              0x00405000
So from this we can see that the stored eip is located on the stack at address 0x28fe9c. The return address (eip) is pushed first, by the call instruction, and ebp is pushed after it, by the callee's prologue, which is why the saved ebp sits at the lower address 0x28fe98.
Now single-stepping until after the call to strcpy and examining memory again shows:
    (gdb) x/88xb 0x28fe70
    0x28fe70: 0x86 0xfe 0x28 0x00 0xb0 0x18 0x92 0x00
    0x28fe78: 0xff 0xff 0xff 0xff 0x30 0x60 0x44 0x00
    0x28fe80: 0x03 0x00 0x00 0x00 0x8c 0xfe 0x41 0x41
    0x28fe88: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
    0x28fe90: 0x42 0x42 0x42 0x42 0x42 0x42 0x42 0x42
    0x28fe98: 0x42 0x42 0x43 0x43 0x43 0x43 0x43 0x43
    0x28fea0: 0x43 0x43 0x43 0x43 0x00 0x50 0x40 0x00
    0x28fea8: 0x88 0xff 0x28 0x00 0xae 0x1f 0x40 0x00
    0x28feb0: 0x50 0x1f 0x40 0x00 0x60 0x00 0x00 0x40
    0x28feb8: 0x1f 0x00 0x00 0x00 0x05 0x00 0x00 0x00
    0x28fec0: 0x58 0x17 0x92 0x00 0x1f 0x00 0x00 0x00
We can see that both the stored ebp and the stored eip have been clobbered on the stack. Now, when we return from copyinput, the function epilogue pops the saved ebp (which is now 0x43434242) and ret pops the saved eip (which is now 0x43434343) off the stack and attempts to execute the instruction at 0x43434343, which will obviously generate an exception.
The main thrust of a stack attack like this is to arrange things so that we overwrite eip with a valid value of our choosing. For example, consider the following program:
    #include <stdio.h>
    #include <string.h>

    void copyinput(char* input)
    {
        char buf[10];
        strcpy(buf, input);
    }

    void testinput()
    {
        printf("we should never see this\n");
    }

    int main(int argc, char** argv)
    {
        int a = 5;
        copyinput(argv[1]);
        a = 7;
        return 0;
    }
The function testinput is never called. However, if we can overwrite the return address in copyinput with the value 0x0040157a (which is the location of testinput on my machine), we would be able to cause that function to execute.
=================================================================================
answers for questions made in the comments:
Not sure what OS/compiler you are using. I took your sample program and compiled it using gcc-4.8.3 on a Windows 7 box. My disassembly for main looks like this:
    (gdb) disass main
    Dump of assembler code for function main:
       0x00401560 <+0>:  push   %ebp
       0x00401561 <+1>:  mov    %esp,%ebp
       0x00401563 <+3>:  and    $0xfffffff0,%esp
       0x00401566 <+6>:  sub    $0x20,%esp
       0x00401569 <+9>:  call   0x401fc0 <__main>
This is the preamble for main, in which we are setting up the stack frame for main. We push the base pointer of the previous stack frame (from some function provided by the run-time library), then move the base pointer to where the stack pointer is. Next we adjust esp to make it evenly divisible by 16, and then we subtract 32 bytes (0x20) from esp (remember that the stack grows down), so we now have some space that main is going to use.
The sequence push %ebp, mov %esp, %ebp and then sub xxx, %esp is a common preamble for a function.
Let's try to find where things are located in memory. In gdb we can do the following:
    (gdb) x/16xb &argv[0]
    0xa31830: 0x58 0x18 0xa3 0x00 0x98 0x18 0xa3 0x00
    0xa31838: 0x00 0x00 0x00 0x00 0xab 0xab 0xab 0xab
This is what we expect: two 32-bit pointers followed by a null terminator. So argv[0] points to 0x00a31858 and argv[1] points to 0x00a31898, which can be seen by examining the memory at these two locations:
    (gdb) x/20cb 0x00a31858
    0xa31858: 100 'd' 58 ':' 92 '\\' 117 'u' 115 's' 101 'e' 114 'r' 115 's'
    0xa31860: 92 '\\' 103 'g' 104 'h' 117 'u' 98 'b' 101 'e' 114 'r' 92 '\\'
    0xa31868: 71 'G' 78 'N' 85 'U' 72 'H'
    (gdb) x/20xb 0x00a31898
    0xa31898: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
    0xa318a0: 0x41 0x41 0x00 0xab 0xab 0xab 0xab 0xab
    0xa318a8: 0xab 0xab 0xab 0xfe
We can find out where our buffer is located by doing the following in GDB:
    (gdb) print $esp
    $4 = (void *) 0x28fea0
    (gdb) print $ebp
    $5 = (void *) 0x28fec8
    (gdb) x/40xb $esp
    0x28fea0: 0xb6 0xfe 0x28 0x00 0x98 0x18 0xa3 0x00
    0x28fea8: 0x88 0xff 0x28 0x00 0x9e 0x1f 0x40 0x00
    0x28feb0: 0x40 0x1f 0x40 0x00 0x60 0x00 0x41 0x41
    0x28feb8: 0x41 0x41 0x41 0x41 0x41 0x41 0x41 0x41
    0x28fec0: 0x00 0x17 0xa3 0x00 0x0b 0x00 0x00 0x00
So we can see that our buffer starts at 0x28feb6.
Now that we have that out of the way, let's look at the next section of our code, which should be setting up for the call to strcpy:
    0x0040156e <+14>: mov    0xc(%ebp),%eax
    0x00401571 <+17>: add    $0x4,%eax
    0x00401574 <+20>: mov    (%eax),%eax
    0x00401576 <+22>: mov    %eax,0x4(%esp)
    0x0040157a <+26>: lea    0x16(%esp),%eax
    0x0040157e <+30>: mov    %eax,(%esp)
    0x00401581 <+33>: call   0x402748 <strcpy>
As a reminder, in AT&T assembly syntax the address operand looks like this:
displacement(base register, offset register, scalar multiplier)
which is equivalent to the intel syntax:
[base register + displacement + offset register * scalar multiplier]
So with that,
    0x0040156e <+14>: mov    0xc(%ebp),%eax
    0x00401571 <+17>: add    $0x4,%eax
    0x00401574 <+20>: mov    (%eax),%eax
    0x00401576 <+22>: mov    %eax,0x4(%esp)
We add 0x0C to our current base pointer, which gives 0x28FED4, and then copy what is stored at that memory address to eax. By using GDB, we can find out that the four bytes located at 0x28FED4 are 0x00a31830, which is the address of argv[0] (that is, the value of argv). Adding four to eax makes it point to argv[1], and the load through (%eax) fetches argv[1] itself, the address of the argument string. The last instruction stores that address four bytes above esp.
    0x0040157a <+26>: lea    0x16(%esp),%eax
    0x0040157e <+30>: mov    %eax,(%esp)
Continuing along, we compute esp + 0x16 into eax (which gives us 0x28FEB6, previously determined to be where buf[10] is located). We then store this value where esp points. At this time, our stack looks like:
               ~            ~
               |            |
               +------------+
    0x28fea4   | 0x00a31898 |  remember that this is the address of argv[1][0]
               +------------+
    0x28fea0   | 0x0028feb6 |  remember that this is the address of buf[0]
               +------------+
This makes sense given the function prototype for strcpy, which is:
    char* strcpy(char* dst, const char* src);
Typically, arguments are pushed onto the stack from right to left, so we would expect src to be pushed first and dst second. Instead of pushing the arguments onto the stack, the compiler set aside enough space up front so that it could store the required values at the correct places. So everything is in place and we can now call strcpy.
The next few instructions just set up the call to printf (well, actually puts): we need to move the address of the string "Done.\n" onto the stack and then call puts:
    0x00401586 <+38>: movl   $0x404024,(%esp)
    0x0040158d <+45>: call   0x402750 <puts>
Finally, we move the return value into eax (which is the register that normally contains the return value from a function) and then we exit main.
    0x00401592 <+50>: mov    $0x0,%eax
    0x00401597 <+55>: leave
    0x00401598 <+56>: ret
Not sure if I answered all of your questions, but I think I did. Also, I hope I didn't screw up the analysis too much; I don't normally do this kind of in-depth analysis of assembly or use AT&T syntax.
=============== edit2 ===================================
Three remaining questions:
Is the value in line +7 unnecessary? I don't see any use for it, so why is it stored?
Your analysis that we are pushing the original, un-aligned value of esp appears correct. My hunch is that in the first few lines of your disassembly we are looking at special start-up code for main. Remember that there was a stack frame on the stack prior to the creation of the stack frame for main. You may want to take a look at this link to see what the normal start-up order of a program under Linux is.
My hunch is that we need to preserve the unmodified value of esp so that we can restore an earlier stack frame to its correct location.
In some places sp moves more than it has to - is it due to alignment? (e.g. line +14)
I would make the analysis that these lines are where we are actually setting up the stack frame for main. In main+14 we are subtracting 20 bytes from esp, so we are allocating 20 bytes for use by our main function. We can argue that 12 of those bytes are used by our buffer (remember that there probably will be two bytes of padding at the end of our buffer so that the next value stored on the stack will be at a 32-bit word boundary).
    0x08048435 <+10>: push   %ebp
    0x08048436 <+11>: mov    %esp,%ebp
    0x08048438 <+13>: push   %ecx
    0x08048439 <+14>: sub    $0x14,%esp
So, I would claim that main+10 through main+14 are the normal function prolog.
Is my conclusion over line +71 correct?
Yes. At this point we need to have overwritten the stored eip on the stack, and this will cause the RET instruction to read our value. The following description of the RET instruction is taken from here (actually, this page has a good deal of information on assembly and is well worth reading; the only downside is that it uses the Intel syntax and you have been presenting AT&T syntax).
call, ret — Subroutine call and return
These instructions implement a subroutine call and return. The call instruction first pushes the current code location onto the hardware supported stack in memory (see the push instruction for details), and then performs an unconditional jump to the code location indicated by the label operand. Unlike the simple jump instructions, the call instruction saves the location to return to when the subroutine completes.
The ret instruction implements a subroutine return mechanism. This instruction first pops a code location off the hardware supported in-memory stack (see the pop instruction for details). It then performs an unconditional jump to the retrieved code location.
Syntax
    call <label>
    ret
Additional information on the LEAVE instruction (used at main+67) is (taken from here):
Releases the stack frame set up by an earlier ENTER instruction. The LEAVE instruction copies the frame pointer (in the EBP register) into the stack pointer register (ESP), which releases the stack space allocated to the stack frame. The old frame pointer (the frame pointer for the calling procedure that was saved by the ENTER instruction) is then popped from the stack into the EBP register, restoring the calling procedure's stack frame.
A RET instruction is commonly executed following a LEAVE instruction to return program control to the calling procedure.
See "Procedure Calls for Block-Structured Languages" in Chapter 6 of the IA-32 Intel Architecture Software Developer's Manual, Volume 1, for detailed information on the use of the ENTER and LEAVE instructions.
N.B. it is possible to change the flavor of the disassembly emitted by GDB by using the following commands:
    set disassembly-flavor att
    set disassembly-flavor intel
    show disassembly-flavor
The third command shows what the current flavor is.
PS I second Jester's comment in his answer below. Moving the actual vulnerable code off to a function rather than leaving it in main will make the analysis easier, as you don't have to deal with the weirdness of the alignment of main and the unique prolog and epilog for main. Once you've gotten a handle on this type of stack exploitation you can then go back and work through an example where the vulnerability is in main.
PPS Working on a Linux system you may also run into ASLR issues, in that every time you run a program things are at different memory locations, so the offsets between stack frames and the stack frame locations will change. You can use the following short program (taken from The Shellcoder's Handbook: Discovering and Exploiting Security Holes by Chris Anley et al.) to see if ASLR is an issue:
    #include <stdio.h>

    /* Relies on 32-bit x86 and GCC: the stack pointer is left in eax, the
       register that carries the return value, and there is no explicit return. */
    unsigned long find_start(void)
    {
        __asm__("movl %esp, %eax");
    }

    int main()
    {
        printf("0x%lx\n", find_start());
        return (0);
    }
Run the program several times; if the output differs between runs, you have some version of ASLR running. It will make your life more difficult, but the problem is not insurmountable.