Apache Memory Corruption in Strace


Question

I'm not sure whether this is an SF (Server Fault) or SO (Stack Overflow) problem.

Very (very) busy LAMP servers in a load-balanced environment, running Apache 2.2.26 on CentOS 5.10.

I'm trying to track down a code or systems issue with hanging httpd processes. These requests sit in the W ("Sending Reply") state forever, the TCP connections remain in keepalive, and Apache's timeout never fires. Eventually we accumulate enough hung processes that I have to bounce httpd.

This is the end of an strace, and the same pattern seems to surround all of the hung calls. I'm really unsure where to go next with this. It seems Apache is trying to write out to the console, and I'm not sure if that's normal. But the malloc error definitely points to something being wrong. Any help (even wild ideas) is appreciated.


lstat("/tmp", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=12288, ...}) = 0

lstat("/tmp/promo.log", {st_mode=S_IFREG|0644, st_size=14778558, ...}) = 

open("/tmp/promo.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 27

fstat(27, {st_mode=S_IFREG|0644, st_size=14778558, ...}) = 0

lseek(27, 0, SEEK_CUR)                  = 0

lseek(27, 0, SEEK_CUR)                  = 0

write(27, "20140314065931 : cartitem::calcu"..., 68) = 68

close(27)                               = 0

open("/dev/tty", O_RDWR|O_NOCTTY|O_NONBLOCK) = -1 ENXIO (No such device or address)

writev(2, [{"*** glibc detected *** ", 23}, {"/usr/local/apache-2.2.26/bin/htt"..., 34}, {": ", 2}, {"double free or corruption (!prev"..., 33}, {": 0x", 4}, {"0000000005ebfb20", 16}, {" ***\n", 5}], 7) = 117

open("/usr/local/apache-2.2.26/lib/libgcc_s.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/usr/lib64/tls/libgcc_s.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/usr/lib64/libgcc_s.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/usr/local/apache-2.2.26/lib/libgcc_s.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)

open("/etc/ld.so.cache", O_RDONLY)      = 27

fstat(27, {st_mode=S_IFREG|0644, st_size=93743, ...}) = 0

mmap(NULL, 93743, PROT_READ, MAP_PRIVATE, 27, 0) = 0x2ba82af8b000
close(27)                               = 0

open("/lib64/libgcc_s.so.1", O_RDONLY)  = 27

read(27, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\36`T8\0\0\0"..., 832) = 832

fstat(27, {st_mode=S_IFREG|0755, st_size=58400, ...}) = 0

open("/dev/tty", O_RDWR|O_NOCTTY|O_NONBLOCK) = -1 ENXIO (No such device or address)

writev(2, [{"*** glibc detected *** ", 23}, {"/usr/local/apache-2.2.26/bin/htt"..., 34}, {": ", 2}, {"malloc(): memory corruption", 27}, {": 0x", 4}, {"0000000005fc1e70", 16}, {" ***\n", 5}], 7) = 111

futex(0x2ba822f1dfc0, FUTEX_WAIT_PRIVATE, 9, NULL) = -1 EINTR (Interrupted system call)
--- SIGTERM (Terminated) @ 0 (0) ---

futex(0x2ba822f1b9e0, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EINTR (Interrupted system call)
--- SIGCONT (Continued) @ 0 (0) ---

futex(0x2ba822f1b9e0, FUTEX_WAIT_PRIVATE, 2, NULL

Recommended answer

If it were a physical memory error, you would probably see it in your VM host logs. My guess is that it is a rogue request or script error that is overloading memory. Since you are monitoring the process status, I am assuming you already have ExtendedStatus enabled.

Is it possible you have an application-level error? Perhaps a recursive function call that never returns. Is there any pattern in the request paths of the hung processes? You might consider logging PIDs and request data at the application level to see whether a particular request is triggering an error at that level.
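One way to act on the PID logging idea: since the access log is only written when a request completes, a hung request may never show up there, so start and end markers written by the application can help. The following is a minimal sketch, not anything from the original setup. It assumes a hypothetical marker format (`START <pid> <path>` and `END <pid>`) and assumes each httpd child handles one request at a time (as with the prefork MPM), so the PID alone identifies the request in flight.

```python
import re
from typing import Dict, Iterable

# Hypothetical marker lines written by the application (an assumption):
#   START <pid> <request path>
#   END <pid>
LINE_RE = re.compile(r"^(START|END)\s+(\d+)(?:\s+(\S+))?\s*$")


def find_unfinished(lines: Iterable[str]) -> Dict[int, str]:
    """Return {pid: path} for requests that logged START but never END."""
    open_requests: Dict[int, str] = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # ignore anything that is not a marker line
        kind, pid, path = match.group(1), int(match.group(2)), match.group(3)
        if kind == "START":
            # Assumes one request per process at a time (prefork-style).
            open_requests[pid] = path or "?"
        else:
            open_requests.pop(pid, None)
    return open_requests


if __name__ == "__main__":
    sample = [
        "START 101 /cart/view",
        "END 101",
        "START 102 /cart/checkout",
        "START 103 /promo",
        "END 103",
    ]
    for pid, path in find_unfinished(sample).items():
        print(f"pid {pid} never finished: {path}")
```

The PIDs reported here could then be compared with the processes stuck in the W state on the status page, to see whether the hung requests share a path.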

Also, the mod_log_config %D format (request time in microseconds), logged along with the path, may help you narrow down the culprit.
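As a rough illustration of using %D together with the path, the sketch below ranks request paths by their slowest logged request. It assumes a custom log format along the lines of `LogFormat "%P %D \"%r\""` (child PID, microseconds, request line); that format, the function name, and the sample lines are assumptions for illustration, not taken from the original configuration.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Assumed log line shape (from an assumed LogFormat "%P %D \"%r\""):
#   12345 1532 "GET /cart/view?id=7 HTTP/1.1"
ENTRY_RE = re.compile(r'^(\d+)\s+(\d+)\s+"(\S+)\s+(\S+)[^"]*"')


def slowest_paths(lines: Iterable[str], top: int = 5) -> List[Tuple[str, int, int]]:
    """Return up to `top` tuples of (path, slowest_microseconds, hits)."""
    stats: Dict[str, Tuple[int, int]] = {}
    for line in lines:
        match = ENTRY_RE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        micros = int(match.group(2))
        path = match.group(4).split("?", 1)[0]  # drop the query string
        worst, hits = stats.get(path, (0, 0))
        stats[path] = (max(worst, micros), hits + 1)
    ranked = sorted(
        ((path, worst, hits) for path, (worst, hits) in stats.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top]


if __name__ == "__main__":
    sample = [
        '100 1500 "GET /a?x=1 HTTP/1.1"',
        '101 9000000 "POST /cart/checkout HTTP/1.1"',
        '100 2500 "GET /a HTTP/1.1"',
    ]
    for path, worst, hits in slowest_paths(sample):
        print(f"{path}: slowest {worst} us over {hits} request(s)")
```

Grouping by path rather than by full URL keeps query-string variations from hiding a pattern; whether that is the right grouping depends on how the application routes requests.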
