Linux automatically restarting application on crash - Daemons


Problem description


I have a system running embedded Linux, and it is critical that it runs continuously. Basically, it is a process that communicates with sensors and relays the data to a database and a web client.

If a crash occurs, how do I restart the application automatically?

Also, there are several threads doing polling (e.g. socket and UART communications). How do I ensure that none of the threads hang or exit unexpectedly? Is there an easy-to-use watchdog that is thread-friendly?

Solution

The gist of it is:

  1. You need to detect if the program is still running and not hung.
  2. You need to (re)start the program if the program is not running or is hung.

There are a number of different ways to do #1, but two that come to mind are:

  1. Listening on a UNIX domain socket to handle status requests. An external application can then ask whether the application is still OK. If it gets no response within some timeout period, it can assume that the application being queried has deadlocked or died.

  2. Periodically touching a file at a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked.
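
To make the second approach concrete, here is a minimal sketch in Python. The path, the timeout, and the function names `touch_heartbeat` and `is_stale` are all hypothetical choices for illustration; nothing here comes from a specific library beyond the standard `os` and `time` modules.

```python
import os
import time

# Hypothetical values chosen for this sketch only.
HEARTBEAT_PATH = "/tmp/sensor_app.heartbeat"
STALE_AFTER_S = 10.0


def touch_heartbeat(path: str = HEARTBEAT_PATH) -> None:
    """Called periodically by the monitored application."""
    # Create the file if it does not exist yet, then set its mtime to now.
    with open(path, "a"):
        pass
    os.utime(path, None)


def is_stale(path: str = HEARTBEAT_PATH, max_age_s: float = STALE_AFTER_S) -> bool:
    """Called by the external monitor: True if the app looks dead or hung."""
    try:
        age = time.time() - os.stat(path).st_mtime
    except FileNotFoundError:
        return True  # no heartbeat has ever been written
    return age > max_age_s
```

The application would call `touch_heartbeat()` from its main loop, and the monitor would call `is_stale()` on a timer. Note that this compares against the wall clock, so a large clock adjustment on the device could produce a false alarm or hide a real one.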

With respect to #2, the typical approach is to kill the previous PID and use fork+exec to launch a new process. You might also consider turning your application from one that runs "continuously" into one that runs once and exits, and then using "cron" or some other application to keep rerunning that single-run application.
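
As a rough illustration of the kill-then-fork+exec approach, a small supervisor could look like the sketch below. The function names (`spawn`, `kill_and_reap`, `supervise`), the `is_hung` callback, and the `max_restarts` limit are assumptions made for this example; a real supervisor would usually loop forever and plug in whatever liveness check it uses.

```python
import os
import signal
import time


def spawn(argv):
    """fork + exec: returns the child's PID to the parent."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec itself failed
    return pid


def kill_and_reap(pid, grace_s=2.0):
    """Ask the old process to exit, escalate to SIGKILL, then reap it."""
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # already gone
    deadline = time.monotonic() + grace_s
    while time.monotonic() < deadline:
        done, _ = os.waitpid(pid, os.WNOHANG)
        if done:
            return
        time.sleep(0.05)
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)


def supervise(argv, is_hung, max_restarts, poll_s=1.0):
    """Restart argv whenever it exits or is_hung() reports it stuck.

    Stops after max_restarts restarts and returns that count; the limit
    exists only to keep this sketch finite.
    """
    restarts = 0
    pid = spawn(argv)
    while True:
        time.sleep(poll_s)
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done == 0 and not is_hung():
            continue  # still running and responsive
        if done == 0:
            kill_and_reap(pid)  # running but hung
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        pid = spawn(argv)
```

The `is_hung` callback is where a socket status query or a stale-timestamp check would go. Reaping the child with `waitpid` matters on a long-running embedded system, since otherwise each restart leaves a zombie behind.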

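Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, tsan (ThreadSanitizer) can help detect potential deadlock scenarios and other threading issues. Note that it is a dynamic analysis tool: it instruments the program and reports problems on the code paths that are actually exercised at run time, rather than by static analysis.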
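
For the per-thread part of the question, one pattern I have seen is to have every worker thread check in on each loop iteration and let a monitor flag any thread that goes silent. The sketch below is only that: `ThreadWatchdog`, `kick`, and `stalled` are made-up names, and it detects a stuck or exited thread without being able to recover it.

```python
import threading
import time


class ThreadWatchdog:
    """Tracks the last check-in time of each registered worker thread."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self._last_seen = {}
        self._lock = threading.Lock()

    def kick(self, name: str) -> None:
        """Workers call this once per loop iteration."""
        with self._lock:
            self._last_seen[name] = time.monotonic()

    def stalled(self) -> list:
        """Names of workers that have not checked in within timeout_s."""
        now = time.monotonic()
        with self._lock:
            return sorted(
                name
                for name, seen in self._last_seen.items()
                if now - seen > self.timeout_s
            )
```

A thread that exits unexpectedly stops calling `kick()` too, so it shows up in `stalled()` the same way a hung one does. Since a hung thread generally cannot be unstuck from the outside, a common reaction is for the monitor to stop updating the external heartbeat (or exit the process) so that the external supervisor restarts the whole application.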
