Hung processes resume if attached to strace

Views: 210 Published: 2018/4/21 15:01:23 Tags: sockets linux-kernel gdb strace ptrace


Problem description

I have a network program written in C using TCP sockets. Sometimes the client program hangs forever expecting input from the server. Specifically, the client hangs in a select() call on an fd used to read characters sent by the server.

I am using strace to find out where the process is stuck. However, sometimes when I attach strace to a hung client process, it immediately resumes execution and exits properly. Not all hung processes behave this way; some remain stuck in select() even after I attach strace. But most of them resume execution when strace is attached.

I am curious what causes the processes to resume when strace is attached. It might give me clues as to why the client processes hang.

Any ideas? What causes a hung process to resume its execution when attached to strace?

Update:

Here's the output of strace on hung processes.
    > sudo strace -p 25645
    Process 25645 attached - interrupt to quit
    --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
    --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
    [ Process PID=25645 runs in 32 bit mode. ]
    select(6, [3 5], NULL, NULL, NULL) = 2 (in [3 5])
    read(5, "\0", 8192) = 1
    write(2, "", 0) = 0
    read(3, "====Setup set_oldtempbehaio"..., 8192) = 555
    write(1, "====Setup set_oldtempbehaio"..., 555) = 555
    select(6, [3 5], NULL, NULL, NULL) = 2 (in [3 5])
    read(5, "", 8192) = 0
    read(3, "", 8192) = 0
    close(5) = 0
    kill(25652, SIGKILL) = 0
    exit_group(0) = ?
    Process 25645 detached

    > sudo strace -p 14462
    Process 14462 attached - interrupt to quit
    [ Process PID=14462 runs in 32 bit mode. ]
    read(0, 0xff85fdbc, 8192) = -1 EIO (Input/output error)
    shutdown(3, 1 /* send */) = 0
    exit_group(0) = ?

    > sudo strace -p 7517
    Process 7517 attached - interrupt to quit
    --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
    --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
    [ Process PID=7517 runs in 32 bit mode. ]
    connect(3, {sa_family=AF_INET, sin_port=htons(300), sin_addr=inet_addr("100.64.220.98")}, 16) = -1 ETIMEDOUT (Connection timed out)
    close(3) = 0
    dup(2) = 3
    fcntl64(3, F_GETFL) = 0x1 (flags O_WRONLY)
    close(3) = 0
    write(2, "dsd13: Connection timed out\n", 30) = 30
    write(2, "Error code : 110\n", 17) = 17
    rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
    exit_group(1) = ?
    Process 7517 detached

Not just select(): processes of the same program are stuck in various system calls before I attach strace to them. They suddenly resume once strace is attached. If I don't attach strace, they just hang there forever.

Update 2:

I learned that strace could start a process which was previously stopped (a process in the T state). Now I am trying to understand why these processes went into the 'T' state. Here's the /proc//status information:
    > cat /proc/12554/status
    Name: someone
    State: T (stopped)
    SleepAVG: 88%
    Tgid: 12554
    Pid: 12554
    PPid: 9754
    TracerPid: 0
    Uid: 5000 5000 5000 5000
    Gid: 48986 48986 48986 48986
    FDSize: 256
    Groups: 9149 48986
    VmPeak: 1992 kB
    VmSize: 1964 kB
    VmLck: 0 kB
    VmHWM: 608 kB
    VmRSS: 608 kB
    VmData: 156 kB
    VmStk: 20 kB
    VmExe: 16 kB
    VmLib: 1744 kB
    VmPTE: 20 kB
    Threads: 1
    SigQ: 54/73728
    SigPnd: 0000000000000000
    ShdPnd: 0000000000000000
    SigBlk: 0000000000000000
    SigIgn: 0000000000000006
    SigCgt: 0000000000004000
    CapInh: 0000000000000000
    CapPrm: 0000000000000000
    CapEff: 0000000000000000
    Cpus_allowed: 00000000,00000000,00000000,0000000f
    Mems_allowed: 00000000,00000001
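
The SigBlk, SigIgn and SigCgt fields in this status output are hexadecimal bitmasks in which bit n-1 stands for signal number n. Decoding them can help narrow down which signals the process handles itself. The following is only a minimal sketch; decode_sigmask and print_sigmask are hypothetical helper names, not part of any library.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Decode a hexadecimal signal mask as printed in /proc/<pid>/status
 * (SigBlk, SigIgn, SigCgt, ...). Bit (n - 1) set means signal number n.
 * Stores up to max signal numbers in sigs, in ascending order.
 * Returns the number of signals set in the mask, or -1 if hex is not
 * a plain hexadecimal number.
 */
int decode_sigmask(const char *hex, int *sigs, size_t max)
{
    char *end;
    unsigned long long mask = strtoull(hex, &end, 16);
    int count = 0;

    if (end == hex || *end != '\0')
        return -1;

    for (int sig = 1; sig <= 64; sig++) {
        if (mask & (1ULL << (sig - 1))) {
            if ((size_t)count < max)
                sigs[count] = sig;
            count++;
        }
    }
    return count;
}

/* Print one status field name followed by the decoded signal numbers. */
void print_sigmask(const char *field, const char *hex)
{
    int sigs[64];
    int n = decode_sigmask(hex, sigs, 64);

    if (n < 0) {
        printf("%s: (invalid mask)\n", field);
        return;
    }
    printf("%s:", field);
    for (int i = 0; i < n; i++)
        printf(" %d", sigs[i]);
    printf("\n");
}
```

Called as print_sigmask("SigIgn", "0000000000000006"), this reports signals 2 and 3 (SIGINT and SIGQUIT on Linux), and SigCgt 0000000000004000 decodes to signal 15 (SIGTERM). SigBlk is empty. In other words, none of the job-control stop signals (SIGTSTP, SIGTTIN, SIGTTOU) is blocked, ignored or caught in this process, so any of them would take its default action, which is to stop the process. That is consistent with the T state shown above, although it does not by itself say which signal was delivered.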

Solution
strace uses ptrace. The ptrace man page has this:
Since attaching sends SIGSTOP and the tracer usually suppresses it, this may cause a stray EINTR return from the currently executing system call in the tracee, as described in the "Signal injection and suppression" section.
Are you seeing select return EINTR?
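
If select does return EINTR, a client that does not restart the call may behave differently once a tracer attaches. As a rough illustration only, here is one way a client could wait on an fd and retry on EINTR, logging each interruption so it becomes visible. read_when_ready is a hypothetical helper and is not taken from the program in the question, whose source is not shown.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable, then read from it.
 * select() and read() are restarted when a signal interrupts them (EINTR).
 * timeout_sec < 0 means wait forever, as in the original select() call.
 * Returns the byte count from read() (0 = peer closed), -1 on error,
 * or -2 if the timeout expired first.
 */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_sec)
{
    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EBADF;
        return -1;
    }

    for (;;) {
        fd_set rfds;
        struct timeval tv;

        /* Rebuild both arguments on every pass: select() may modify them.
         * Note that this restarts the full timeout after each retry. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;

        int rc = select(fd + 1, &rfds, NULL, NULL,
                        timeout_sec < 0 ? NULL : &tv);
        if (rc < 0) {
            if (errno == EINTR) {
                fprintf(stderr, "select() interrupted by a signal, retrying\n");
                continue;
            }
            return -1;
        }
        if (rc == 0)
            return -2;  /* timed out, nothing to read */

        ssize_t n = read(fd, buf, len);
        if (n < 0 && errno == EINTR)
            continue;
        return n;
    }
}
```

Passing a finite timeout_sec instead of a NULL timeout also gives the client a chance to report that the server has gone quiet rather than blocking indefinitely.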
