KILL only if not already dead
Process restart scripts on Unix will normally send a TERM (-15) signal via kill(1) or kill(2) and then move on, or send a brutal KILL (-9). Neither approach should be used: a TERM signal may leave the process running, and KILL should only be sent as a last resort, because it prevents the process from cleaning up shared memory, temporary files, and other open resources. Instead, send a TERM signal, then check whether the process has exited. Only if it has not should the KILL signal be used.
The following script will help test process-killing code. The script ignores the default TERM signal, so some other signal is required to stop the process, such as INT (-2, or Ctrl+C) or KILL. The perlipc documentation contains more information on signal handling in Perl.
    #!/usr/bin/perl -l
    print $$;
    $SIG{TERM} = 'IGNORE';
    $SIG{INT} = sub { print "whoa"; exit };
    sleep 3 while 1;
GNU ps contains options to match running processes, such as:
    pid_check=`ps ho pid $pid`
    if [ -z "$pid_check" ]; then
      echo "info: process not running: pid=$pid"
    fi
However, these options do not work on other ps(1) implementations. Inspecting the result of kill -0 $pid should be more portable, though it must be tested on each new flavor of Unix. Example shell code to kill a process and ensure it exits:
    #!/bin/sh

    # Returns 0 if supplied pid not found,
    # 1 if still running. Back-off delay
    # allows slow processes to spin down.
    confirm_process_exit () {
      PID=$1
      for delay in 0 1 2 3 5 8; do
        # printf instead of echo -n, which is not portable
        printf .
        sleep $delay
        # kill reports a missing pid on stderr, so silence that stream
        if ! kill -0 "$PID" 2>/dev/null; then
          return 0
        fi
      done
      return 1
    }

    # Kill process, then ensure exits
    kill "$1"
    confirm_process_exit "$1"
    STATUS=$?
    if [ $STATUS -eq 1 ]; then
      kill -9 "$1"
    fi
The Portable Shell Programming book covers ps(1) and other shell portability concerns. If possible, use Perl or another modern language, as the loop handling code around kill(2) will be more testable and portable than the equivalent shell code.
Slow-to-exit applications will also require special handling, as they may take upwards of a minute to spin down. Java embedded with Oracle… uggh.
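One way to handle such slow applications is to make the grace period a parameter rather than a fixed back-off list. The following is only a sketch: the function name stop_with_grace, the one-second polling interval, and the 60-second default are assumptions for illustration and should be tuned per application. Note also that kill -0 fails when the caller lacks permission to signal the process, so a return of 0 here means "gone or not signalable".

```shell
#!/bin/sh

# Hypothetical helper (not from the original text): send TERM, poll once
# a second for up to GRACE seconds, and send KILL only if the process is
# still there when the grace period runs out.
# Returns 0 if the process went away without KILL, 1 if KILL was sent.
stop_with_grace () {
  pid=$1
  grace=${2:-60}   # assumed default of 60 seconds

  kill "$pid" 2>/dev/null

  waited=0
  while [ "$waited" -lt "$grace" ]; do
    # kill -0 sends no signal; it only tests whether the pid can be signaled
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
    waited=$((waited + 1))
  done

  # One last check before resorting to KILL
  kill -0 "$pid" 2>/dev/null || return 0
  kill -9 "$pid" 2>/dev/null
  return 1
}
```

A restart script might call this as stop_with_grace "$pid" 90 for a slow application and log a warning whenever the return value is 1, since a KILL usually means leftover resources need attention.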
On a somewhat related note, ensure new application code is load tested before it sees production use. Systems often exhibit unexpected behavior under heavy CPU or memory load.