Fork and core dump with threads


Question


Similar points to the one in this question have been raised before here and here, and I'm aware of the Google coredump library (which I've appraised and found lacking, though I might try and work on that if I understand the problem better).

I want to obtain a core dump of a running Linux process without interrupting the process. The natural approach is to say:

if (!fork()) { abort(); }

Since the forked process gets a fixed snapshot copy of the original process's memory, I should get a complete core dump, and since the copy uses copy-on-write, it should generally be cheap. However, a critical shortcoming of this approach is that fork() only forks the current thread, and all other threads of the original process won't exist in the forked copy.

My question is whether it is possible to somehow obtain the relevant data of the other, original threads. I'm not entirely sure how to approach this problem, but here are a couple of sub-questions I've come up with:

  1. Is the memory that contains all of the threads' stacks still available and accessible in the forked process?

  2. Is it possible to (quickly) enumerate all the running threads in the original process and store the addresses of the bases of their stacks? As I understand it, the base of a thread stack on Linux contains a pointer to the kernel's thread bookkeeping data, so...

  3. with the stored thread base addresses, could you read out the relevant data for each of the original threads in the forked process?

If that is possible, perhaps it would only be a matter of appending the data of the other threads to the core dump. However, if that data is lost at the point of the fork already, then there doesn't seem to be any hope for this approach.

Solution

Are you familiar with process checkpoint-restart? In particular, CRIU? It seems to me it might provide an easy option for you.

I want to obtain a core dump of a running Linux process without interrupting the process [and] to somehow obtain the relevant data of the other, original threads.

Forget about not interrupting the process. If you think about it, a core dump has to interrupt the process for the duration of the dump; your true goal must therefore be to minimize the duration of this interruption. Your original idea of using fork() does interrupt the process, it just does so for a very short time.

  1. Is the memory that contains all of the threads' stacks still available and accessible in the forked process?

No. The fork() only retains the thread that does the actual call, and the stacks for the rest of the threads are lost.

Here is the procedure I'd use, assuming CRIU is unsuitable:

  • Have a parent process that generates a core dump of the child process whenever the child is stopped. (Note that more than one consecutive stop event may be generated; only the first one until the next continue event should be acted on.)

    You can detect the stop/continue events using waitpid(child, &status, WUNTRACED | WCONTINUED).

  • Optional: Use sched_setaffinity() to restrict the process to a single CPU, and sched_setscheduler() (and perhaps sched_setparam()) to drop the process priority to IDLE.

    You can do this from the parent process, which only needs the CAP_SYS_NICE capability in both the effective and permitted sets (you can grant it to the parent binary with setcap 'cap_sys_nice=pe' parent-binary, assuming filesystem capabilities are enabled, as they are on most current Linux distributions).

    The intent is to minimize the progress of the other threads between the moment a thread decides it wants a snapshot/dump, and the moment when all threads have been stopped. I have not tested how long it takes for the changes to take effect -- certainly they only happen at the end of their current timeslices at the very earliest. So, this step should probably be done a bit beforehand.

    Personally, I don't bother. On my four-core machine, the following SIGSTOP alone yields similar latencies between threads as a mutex or a semaphore does, so I don't see any need to strive for even better synchronization.

  • When a thread in the child process decides it wants to take a snapshot of itself, it sends a SIGSTOP to itself (via kill(getpid(), SIGSTOP)). This stops all threads in the process.

    The parent process will receive the notification that the child was stopped. It will first examine /proc/PID/task/ to obtain the TIDs of each thread of the child process (and perhaps the /proc/PID/task/TID/ pseudofiles for other information), then attach to each TID using ptrace(PTRACE_ATTACH, TID). Obviously, ptrace(PTRACE_GETREGS, TID, ...) will obtain the per-thread register state, which can be used in conjunction with /proc/PID/task/TID/smaps and /proc/PID/task/TID/mem to obtain the per-thread stack trace, and whatever other information you're interested in. (For example, you could create a debugger-compatible core file for each thread.)

    When the parent process is done grabbing the dump, it lets the child process continue. I believe you need to send a separate SIGCONT signal to let the entire child process continue, instead of just relying on ptrace(PTRACE_CONT, TID), but I haven't checked this; do verify this, please.

I do believe that the above will yield a minimal wall-clock delay between the threads in the process stopping. Quick tests on an AMD Athlon II X4 640 on Xubuntu with a 3.8.0-29-generic kernel indicate that tight loops incrementing a volatile variable in the other threads only advance the counters by a few thousand, depending on the number of threads (there's too much noise in the few tests I made to say anything more specific).

Limiting the process to a single CPU, and even to IDLE priority, will drastically reduce that delay even further. CAP_SYS_NICE capability allows the parent to not only reduce the priority of the child process, but also lift the priority back to original levels; filesystem capabilities mean the parent process does not even have to be setuid, as CAP_SYS_NICE alone suffices. (I think it'd be safe enough -- with some good checks in the parent program -- to be installed in e.g. university computers, where students are quite active in finding interesting ways to exploit the installed programs.)

It is possible to create a kernel patch (or module) that provides a boosted kill(getpid(), SIGSTOP) that also tries to kick off the other threads from running CPUs, and thus try to make the delay between the threads stopping even smaller. Personally, I wouldn't bother. Even without the CPU/priority manipulation I get sufficient synchronization (small enough delays between the times the threads are stopped).

Do you need some example code to illustrate my ideas above?
