Application is getting killed without any reason. Suspecting high BSS. How to debug it?


Problem description

I have been running my application successfully on CentOS 6.6. Recently the hardware (motherboard and RAM) was updated, and now the application is killed for no apparent reason.

[root@localhost PktBlaster]# ./PktBlaster
Killed

Output of file and ldd

[root@localhost PktBlaster]# file PktBlaster
PktBlaster: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped

[root@localhost PktBlaster]# ldd PktBlaster
not a dynamic executable

Output of strace

[root@localhost PktBlaster]# strace ./PktBlaster
execve("./PktBlaster", ["./PktBlaster"], [/* 30 vars */] <unfinished ...>
+++ killed by SIGKILL +++
Killed

GDB

[root@localhost PktBlaster]# gdb PktBlaster
(gdb) break main
Breakpoint 1 at 0x43d664: file VTP.c, line 544.
(gdb) run
Starting program: /root/Veryx/PktBlaster/PktBlaster 
During startup program terminated with signal SIGKILL, Killed.

While debugging, I observed that the BSS is huge (~6 GB). The system has 4 GB of RAM, and I think this could be the cause of the issue.

[root@localhost PktBlaster_1Gig]# size build/unix/bin/PktBlaster 
   text    data     bss     dec     hex filename
 375551   55936 6747541120  6747972607  19235e3ff   build/unix/bin/PktBlaster

The application contains many .h files and many data structures, so it is difficult for me to identify why the BSS has grown to 6 GB.

Could anyone please suggest how to identify which file is causing this, or any other easier way to debug it?

Recommended answer

It seems that the problem really is the huge BSS size. In the comments I asked you to show the output of LD_DEBUG=all /lib64/ld-linux-x86-64.so.2 /path/to/exe.

/lib64/ld-linux-x86-64.so.2 is the runtime linker, which the OS uses to load your binary into process memory during the execve system call. The runtime linker is responsible for parsing the executable format, loading all segments and dependencies into memory, performing the required relocations, and so on. Setting the environment variable LD_DEBUG to all instructs the runtime linker to generate debug output.

[root@localhost PktBlaster]# LD_DEBUG=all /lib64/ld-linux-x86-64.so.2 /root/Veryx/PktBlaster/PktBlaster
851: file=/root/Veryx/PktBlaster/PktBlaster [0]; generating link map
/root/Veryx/PktBlaster/PktBlaster: error while loading shared libraries: /root/Veryx/PktBlaster/PktBlaster: cannot map zero-fill pages: Cannot allocate memory

Searching for this error message in the source code of the runtime linker (glibc-2.17, elf/dl-load.c, around line 1400), we find:

1393         if (zeroend > zeropage)
1394           {
1395         /* Map the remaining zero pages in from the zero fill FD.  */
1396         caddr_t mapat;
1397         mapat = __mmap ((caddr_t) zeropage, zeroend - zeropage,
1398                 c->prot, MAP_ANON|MAP_PRIVATE|MAP_FIXED,
1399                 -1, 0);
1400         if (__builtin_expect (mapat == MAP_FAILED, 0))
1401           {
1402             errstring = N_("cannot map zero-fill pages");
1403             goto call_lose_errno;
1404           }

The loader is in the process of loading the BSS segment, which, as an optimization, is stored in the binary only as a byte count; the memory itself must be zero-initialized at load time. The loader tries to allocate a zero-initialized memory block through an anonymous mmap (MAP_ANON) and gets this error from the OS:

#define ENOMEM      12  /* Out of memory */

From man 2 mmap:

ENOMEM No memory is available, or the process's maximum number of mappings would have been exceeded.

So it seems that, for whatever reason, the OS cannot fulfill the loader's request for memory. Either some limit is in effect (systemd, a per-process limit, some security LKM, and so on), or the kernel simply does not have enough free memory available.

To determine which object file contributes most of the BSS, use:

objdump -j '.bss' -t *.o 
