Bash script to monitor process and sendmail if failed

Problem description

I realize that I can't reliably count on ps | grep or its variants to tell me accurately which PID is running. However, I know what I need for the interim until this problem is resolved in the next release.

I have a process named Foo that is the parent; TEST1 and TEST2 are its child processes. If TEST1 and/or TEST2 dies, Foo continues to run and does not respawn them, yet both are needed for it to function properly. I know this because restarting TEST1 and/or TEST2 requires Foo to be restarted first.

So I want to monitor the child processes: if one fails, send an email saying that it failed, then restart the service and send another email saying that it has started again. I plan to run the script via cron every 5 minutes.

The check works on its own, and so does sendmail. The problem appears when I combine them in an if/else statement: when TEST1 or TEST2 dies, the script still logs that it is running when it is not. Can someone help me with this, please?

#!/bin/bash
#Check if process is running
VAL1=`/usr/ucb/ps aux | grep "[P]ROCESS TEST1" >/dev/null`
VAL2=`/usr/ucb/ps aux | grep "[P]ROCESS TEST2" >/dev/null`
if $VAL1 && $VAL2; then
echo "$(date) - $VAL1 & $VAL2 is Running" >> /var/tmp/Log.txt;
else
SUBJ="Process has stopped"
FROM="Server"
TO="someone@acme.com"
(
cat << !
To : ${TO}
From : ${FROM}
Subject : ${SUBJ}
!
cat << !
The $VAL1 and $VAL2 went down at $(date) please login to the server to restart
!
) | sendmail -v ${TO}
elseif
/usr/sbin/svcadm disable Foo;
wait 10;
/usr/sbin/svcadm enable Foo; 
fi

Solution

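So, one thing about your tests is that you're pushing the output to /dev/null, which means that VAL1 and VAL2 will always be empty. An empty, unquoted variable expands to no command at all, which bash treats as success, so the `if $VAL1 && $VAL2` test always passes and the script keeps logging that the processes are running.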
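To see both effects without the real processes, here is a minimal sketch. The `printf` call stands in for the `ps` output and the variable names `EMPTY` and `RESULT` are made up for illustration; none of this is part of the original script.

```shell
#!/usr/bin/env bash
# printf stands in for the ps output here (illustrative only).

# Output redirected to /dev/null: nothing is left for the substitution to capture.
VAL1=$(printf 'PROCESS TEST1\n' | grep "[P]ROCESS TEST1" >/dev/null)

# No redirection: the matching line is captured.
VAL2=$(printf 'PROCESS TEST1\n' | grep "[P]ROCESS TEST1")

echo "VAL1='${VAL1}'"
echo "VAL2='${VAL2}'"

# An empty, unquoted variable expands to no command at all,
# and bash reports that as success, so the "then" branch runs.
EMPTY=""
if $EMPTY; then
    RESULT="treated as running"
else
    RESULT="treated as stopped"
fi
echo "$RESULT"
```

Running it shows an empty `VAL1`, a populated `VAL2`, and the `then` branch being taken for the empty variable.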

Secondly, you don't need the elif. You have two basic conditions. Either things are running, or they are not. If anything is not running, send an email. You could do some additional testing to determine whether it's PROCESS TEST1 or PROCESS TEST2 that died, but that wouldn't strictly be necessary.
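As a side note, the running-or-not decision can also be taken from grep's exit status rather than from captured text, in which case sending the output to /dev/null is harmless. The sketch below assumes a hypothetical helper named `is_running` and a made-up process listing in place of `/usr/ucb/ps aux`, so it can be tried anywhere.

```shell
#!/usr/bin/env bash
# is_running is a hypothetical helper: it succeeds if any line of the
# given process listing matches the pattern. The listing is passed in
# so the sketch can be tried without the real processes.
is_running() {
    local pattern=$1 listing=$2
    # The function's exit status is grep's: 0 on a match, 1 otherwise.
    printf '%s\n' "$listing" | grep "$pattern" >/dev/null
}

# Sample listing standing in for the ps output (made up).
listing='root 101 0.0 0.1 Foo
root 102 0.0 0.1 PROCESS TEST1'

if is_running "[P]ROCESS TEST1" "$listing"; then STATE1=up; else STATE1=down; fi
if is_running "[P]ROCESS TEST2" "$listing"; then STATE2=up; else STATE2=down; fi

echo "TEST1 is $STATE1, TEST2 is $STATE2"
```

In a real script the listing would come from the `ps` command instead of a string.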

Here's how I might write a script to monitor both processes and restart the service.

#!/usr/bin/env bash

#Check if process is running
PID1=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST1" | awk '{print $2}')
PID2=$(/usr/ucb/ps aux | grep "[P]ROCESS TEST2" | awk '{print $2}')

err=0

if [ "x$PID1" == "x" ]; then
        # PROCESS TEST1 died
        err=$(( err + 1 ))
else
        echo "$(date) - PROCESS TEST1 is Running" >> /var/tmp/Log.txt;
fi

if [ "x$PID2" == "x" ]; then
        # PROCESS TEST2 died
        err=$(( err + 2 ))
else
        echo "$(date) - PROCESS TEST2 is Running" >> /var/tmp/Log.txt;
fi

if (( $err > 0 )); then
        # identify which PROCESS TEST had the problem.
        if (( err == 1 )); then
                condition="PROCESS TEST1 is down"
        elif (( $err == 2 )); then
                condition="PROCESS TEST2 is down"
        else
                condition="PROCESS TEST1 and PROCESS TEST2 are down"
        fi

        # let's send an email to get eyes on the issue, but we will restart the process after
        # we send the email.
        SUBJ="Process Error Detected"
        FROM="Server"
        TO="someone@acme.com"
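        (
        # The here-document body and its terminator start at column 0:
        # <<- would strip leading tabs only, not spaces.
        cat <<EOT
To: ${TO}
From: ${FROM}
Subject: ${SUBJ}

$condition at $(date) please login to the server to check that the processes were restarted successfully.
EOT
        ) | sendmail -v "${TO}"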

        # we reached an error condition, and we sent mail
        # now let's restart the svc.
        /usr/sbin/svcadm restart Foo
fi
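
The script encodes which process failed by adding 1 for TEST1 and 2 for TEST2, so each value of `err` from 0 to 3 identifies one combination. A possible alternative to the nested if/elif is a `case` lookup. This is only a sketch: the function name `describe_failure` and the sample PID values are invented for illustration.

```shell
#!/usr/bin/env bash
# describe_failure is a hypothetical helper that maps the error code
# used in the script above (1 = TEST1, 2 = TEST2, 3 = both) to a message.
describe_failure() {
    case $1 in
        0) echo "all processes are running" ;;
        1) echo "PROCESS TEST1 is down" ;;
        2) echo "PROCESS TEST2 is down" ;;
        3) echo "PROCESS TEST1 and PROCESS TEST2 are down" ;;
        *) echo "unknown error code $1" ;;
    esac
}

# Sample values (made up): TEST1 has no PID, TEST2 does.
pid1=""
pid2="4242"

err=0
if [ -z "$pid1" ]; then err=$(( err + 1 )); fi
if [ -z "$pid2" ]; then err=$(( err + 2 )); fi

condition=$(describe_failure "$err")
echo "$condition"
```

With the sample values only TEST1 is missing, so `err` ends up as 1.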
