Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

kill command and Unix Signals


Contrary to its name, kill is a Unix command that sends signals to running processes. Requesting termination of a process is only one of its uses, and the only one that the name really suits. kill is usually a standalone utility, but some shells, such as Bash, also provide a built-in kill command.

A signal is a message that one process sends to another when some abnormal event takes place or when it wants the other process to perform some action. Most of the time, a process sends a signal to a subprocess it created. This is somewhat similar to communicating with the other process via an I/O pipeline; it is just another way for processes to communicate with each other. That's why signals and pipes are both instances of a more general concept called interprocess communication, or IPC.

Pipes and signals have been available since the days of early UNIX. System V and 4.x BSD in addition implement sockets, named pipes, and shared memory.

Depending on the version of UNIX, there are two or three dozen types of signals, including a few that can be used for whatever purpose a programmer wishes. Signals have numbers (from 1 to the number of signals the system supports) and names. Both can be retrieved with the command kill -l, which lists all signals available on the system with their numbers and symbolic names. For example:

    kill -l 
	1) SIGHUP 2) SIGINT 3) SIGQUIT 4) SIGILL 
	5) SIGTRAP 6) SIGABRT 7) SIGBUS 8) SIGFPE
	9) SIGKILL 10) SIGUSR1 11) SIGSEGV 12) SIGUSR2 
	13) SIGPIPE 14) SIGALRM 15) SIGTERM 16) SIGSTKFLT 
	17) SIGCHLD 18) SIGCONT 19) SIGSTOP 20) SIGTSTP 
	21) SIGTTIN 22) SIGTTOU 23) SIGURG 24) SIGXCPU 
	25) SIGXFSZ 26) SIGVTALRM 27) SIGPROF 28) SIGWINCH 
	29) SIGIO 30) SIGPWR 31) SIGSYS 34) SIGRTMIN 
	35) SIGRTMIN+1 36) SIGRTMIN+2 37) SIGRTMIN+3 38) SIGRTMIN+4 
	39) SIGRTMIN+5 40) SIGRTMIN+6 41) SIGRTMIN+7 42) SIGRTMIN+8 
	43) SIGRTMIN+9 44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12 
	47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14 
	51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10 
	55) SIGRTMAX-9 56) SIGRTMAX-8 57) SIGRTMAX-7 58) SIGRTMAX-6 
	59) SIGRTMAX-5 60) SIGRTMAX-4 61) SIGRTMAX-3 62) SIGRTMAX-2 
	63) SIGRTMAX-1 64) SIGRTMAX

Questions related to the handling of signals are very popular at interviews, so sysadmins need to periodically refresh their knowledge of the topic even if they do not use it much (and most don't).

Signals are asynchronous events that can occur to a running process and may be caused by hardware, software or users. In Unix the various signals that can be sent to processes are encoded as integers. The specific mapping between numbers and signals can vary slightly between Unix implementations, but SIGHUP, SIGINT, SIGTRAP, SIGKILL and SIGTERM have identical codes on the majority of Unix flavors. For example, SIGTERM is numbered 15, while SIGKILL is numbered 9. The numeric values are defined in the header file signal.h.

When a process receives a signal, it can catch it with its own handler, ignore it, or let the default action take place. For most uncaught signals the default action is to terminate the process.

By default, kill sends the termination signal (SIGTERM, number 15), which requests that the process exit. This is a frequent interview question for Unix admins, so you had better remember it.

The name kill is something of a misnomer; the signal sent may have nothing to do with process killing. The kill command is a wrapper around the kill() system call.

There are many different signals that can be sent (see the signal(7) man page for a full list), although the signals in which users are generally most interested are SIGTERM and SIGKILL.

All signals except for SIGKILL and SIGSTOP can be "intercepted" by the process, meaning that a special function can be called when the program receives those signals. The two exceptions SIGKILL and SIGSTOP are only seen by the host system's kernel, providing reliable ways of controlling the execution of processes. SIGKILL kills the process, and SIGSTOP pauses it until a SIGCONT is received.
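As a minimal Bash sketch of such interception, the script below starts a child shell that installs a handler for TERM with the trap builtin, so the signal runs the handler instead of the default action. The handler name on_term, the temporary log file and the one-second sleeps are illustrative assumptions, not part of any standard recipe.

```bash
#!/usr/bin/env bash
# Sketch: a child shell traps SIGTERM, records the fact, then exits cleanly.
log=$(mktemp)   # throwaway file used only for this demonstration

bash -c '
  log=$1
  on_term() { echo "caught TERM" >> "$log"; exit 0; }   # hypothetical handler name
  trap on_term TERM
  echo ready >> "$log"               # tell the parent the handler is installed
  while :; do sleep 1; done          # short sleeps so the trap runs promptly
' _ "$log" &
child=$!

# Do not send the signal before the handler is in place.
until grep -q ready "$log"; do sleep 1; done

kill -TERM "$child"    # same signal as a plain: kill "$child"
wait "$child"
status=$?
echo "child exit status: $status"
cat "$log"
```

Replacing TERM with KILL in the kill line would bypass the handler entirely, because KILL cannot be trapped.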

Unix provides security mechanisms to prevent unauthorized users from killing other processes. Essentially, for a process to send a signal to another, the owner of the signaling process must be the same as the owner of the receiving process or be the superuser.
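One way to probe this permission check without disturbing anything is signal number 0: kill -0 performs the usual existence and permission checks but delivers no signal. The helper name can_signal below is hypothetical and the snippet is only a sketch for Bash.

```bash
#!/usr/bin/env bash
# can_signal PID: succeeds if PID exists and the current user may signal it.
# (Hypothetical helper name; signal 0 only checks, it is never delivered.)
can_signal() {
  kill -0 "$1" 2>/dev/null
}

if can_signal "$$"; then
  echo "own shell ($$): may be signaled"
fi

# PID 1 normally belongs to root, so for an ordinary user this check fails.
if can_signal 1; then
  echo "PID 1: may be signaled"
else
  echo "PID 1: not permitted (or no such process)"
fi
```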

Control-key Signals

If you type CTRL-C, you tell the shell to send the INT (for "interrupt") signal to the current job; CTRL-Z sends TSTP (on most systems, for "terminal stop"). You can also send the current job a QUIT signal by typing CTRL-\ (control-backslash); this is sort of like a "stronger" version of CTRL-C. You would normally use CTRL-\ when (and only when) CTRL-C doesn't work.

CTRL-\ can also cause the shell to leave a file called core in your current directory. This file contains an image of the process to which you sent the signal; a programmer could use it to help debug the program that was running. The file's name is a (very) old-fashioned term for a computer's memory. Other signals leave these "core dumps" as well; you should feel free to delete them unless a systems programmer tells you otherwise.

As we'll see soon, there is also a "panic" signal called KILL (signal number 9, hence kill -9) that you can send to a process when even CTRL-\ doesn't work. But it isn't attached to any control key, which means that you can't use it from the keyboard to stop the currently running process. INT, TSTP, and QUIT are the only signals you can use with control keys.

Some BSD-derived systems have additional control-key signals.

You can customize the control keys used to send signals with options of the stty(1) command. These vary from system to system (consult your man page for the command), but the usual syntax is stty signame char. signame is a name for the signal that, unfortunately, is often not the same as the names we use here. For example, to set your INT key to [CTRL-X] on most systems, use:

stty intr ^X

Most of the other signals are used by the operating system to advise processes of error conditions, like a bad machine code instruction, bad memory address, or division by zero, or "interesting" events such as a user logging out or a timer ("alarm") going off. The remaining signals are used for esoteric error conditions that are of interest only to low-level systems programmers; newer versions of UNIX have more and more arcane signal types.

Zombies

A "zombie process," that is, a child process that has terminated but whose exit status the parent process has not (yet) collected, cannot be killed by a logged-on user -- you can't kill something that is already dead -- but init will generally clean it up sooner or later, once the parent itself has exited.

Formally a zombie process or defunct process is a process that has completed execution but still has an entry in the process table. This entry is still needed to allow the process that started the (now zombie) process to read its exit status. The term zombie process derives from the common definition of zombie—an undead person. In the term's metaphor, the child process has "died" but has not yet been "reaped". Also, unlike normal processes, the kill command has no effect on a zombie process.

When a process ends, all of the memory and resources associated with it are deallocated so they can be used by other processes. However, the process's entry in the process table remains. The parent can read the child's exit status by executing the wait  system call, at which stage the zombie is removed. The wait  call may be executed in sequential code, but it is commonly executed in a handler for the SIGCHLD signal, which the parent receives whenever a child has died.

After the zombie is removed, its process ID and entry in the process table can then be reused. However, if a parent fails to call wait, the zombie will be left in the process table. In some situations this may be desirable, for example if the parent creates another child process it ensures that it will not be allocated the same process ID. On modern UNIX-like systems (that comply with SUSv3 specification in this respect), the following special case applies: if the parent explicitly ignores SIGCHLD by setting its handler to SIG_IGN  (rather than simply ignoring the signal by default) or has the SA_NOCLDWAIT  flag set, all child exit status information will be discarded and no zombie processes will be left.

A zombie process is not the same as an orphan process. An orphan process is a process that is still executing, but whose parent has died. They do not become zombie processes; instead, they are adopted by init  (process ID 1), which waits on its children.

Zombies can be identified in the output from the Unix ps  command by the presence of a “Z” in the “STAT” column. Zombies that exist for more than a short period of time typically indicate a bug in the parent program, or just an uncommon decision not to reap children. If the parent program is no longer running, zombie processes typically indicate a bug in the operating system. As with other leaks, the presence of a few zombies is not worrisome in itself, but may indicate a problem that would grow serious under heavier loads. Since there is no memory allocated to zombie processes except for the process table entry itself, the primary concern with many zombies is not running out of memory, but rather running out of process ID numbers.

To remove zombies from a system, the SIGCHLD signal can be sent to the parent manually, using the kill command. If the parent process still refuses to reap the zombie, the next step would be to remove the parent process. When a process loses its parent, init becomes its new parent. Init periodically executes the wait  system call to reap any zombies with init as parent.
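The following Bash sketch produces a short-lived zombie on purpose and then finds it with ps, which may help when experimenting with the behavior described above. It assumes a ps that accepts the stat, ppid, pid and comm output fields (procps on Linux and the BSD ps do); the three-second lifetime is arbitrary.

```bash
#!/usr/bin/env bash
# Sketch: create a zombie, then list it together with its parent PID.
# The inner shell starts a child and then replaces itself (exec) with
# "sleep 3", a program that never calls wait. The finished child therefore
# stays a zombie until that sleep exits and init reaps the orphan.
bash -c 'sleep 0 & exec sleep 3' &
parent=$!
sleep 1   # give the child time to exit

# Keep only rows whose state starts with Z and whose parent is our demo process.
zombies=$(ps -eo stat=,ppid=,pid=,comm= | awk -v p="$parent" '$1 ~ /^Z/ && $2 == p')
echo "zombie(s) of PID $parent:"
echo "$zombies"

wait "$parent"   # once the parent is gone, the zombie is adopted and reaped
```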

Examples

A process can be sent a SIGTERM signal in four ways (the process ID is '1234' in this case):

kill 1234
kill -s TERM 1234
kill -TERM 1234
kill -15 1234

The process can be sent a SIGKILL signal in three ways:

kill -s  KILL 1234
kill -KILL  1234
kill -9 1234

Other useful signals include SIGHUP, SIGTRAP, and SIGINT

It is also common for CTRL+Z to be mapped to SIGTSTP, and for CTRL+\ (backslash) to be mapped to SIGQUIT, which can force a program to do a core dump.



Old News ;-)

[Jul 29, 2019] A Guide to Kill, Pkill and Killall Commands to Terminate a Process in Linux

Jul 26, 2019 | www.tecmint.com
... ... ...

How about killing a process using its process name?

You must know the exact process name before killing by name; entering the wrong name may terminate a process you did not intend to kill.

# pkill mysqld

Kill more than one process at a time.

# kill PID1 PID2 PID3

or

# kill -9 PID1 PID2 PID3

or

# kill -SIGKILL PID1 PID2 PID3

What if a process has many instances and a number of child processes? For that we have the command 'killall'. Like pkill, it takes a process name as its argument in place of a process number.

Syntax:

# killall [signal or option] Process Name

To kill all mysqld instances (including child processes that run under the same name), use the following command.

# killall mysqld

You can always verify whether the process is still running using any of the commands below.

# service mysql status
# pgrep mysql
# ps aux | grep mysql


[Oct 27, 2017] Neat trick of using su command for killing all processes for a particular user

Oct 27, 2017 | unix.stackexchange.com

If you pass -1 as the process ID argument to either the kill shell command or the kill C function , then the signal is sent to all the processes it can reach, which in practice means all the processes of the user running the kill command or syscall.

su -c 'kill -TERM -1' bob

In C (error checking omitted):

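if (fork() == 0) {
    setuid(uid);               /* become the target user (requires root) */
    signal(SIGTERM, SIG_DFL);  /* make sure SIGTERM is not ignored in this child */
    kill(-1, SIGTERM);         /* signal every process this user may signal */
    _exit(0);                  /* never fall through into the parent's code */
}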

[Oct 27, 2017] c - How do I kill all a user's processes using their UID - Unix Linux Stack Exchange

Oct 27, 2017 | unix.stackexchange.com

osgx ,Aug 4, 2011 at 10:07

Use pkill -U UID or pkill -u UID; a username can be given instead of a numeric UID. Sometimes skill -u USERNAME may work; another tool is killall -u USERNAME .

skill was Linux-specific and is now outdated, and pkill is more portable (Linux, Solaris, BSD).

pkill allows both numeric and symbolic UIDs, effective and real: http://man7.org/linux/man-pages/man1/pkill.1.html

pkill - ... signal processes based on name and other attributes

    -u, --euid euid,...
         Only match processes whose effective user ID is listed.
         Either the numerical or symbolical value may be used.
    -U, --uid uid,...
         Only match processes whose real user ID is listed.  Either the
         numerical or symbolical value may be used.

The man page of skill says that only a username may be used, not a user ID: http://man7.org/linux/man-pages/man1/skill.1.html

skill, snice ... These tools are obsolete and unportable. The command syntax is poorly defined. Consider using the killall, pkill

  -u, --user user
         The next expression is a username.

killall is not marked as outdated in Linux, but it also will not work with a numeric UID, only a username: http://man7.org/linux/man-pages/man1/killall.1.html

killall - kill processes by name

   -u, --user
         Kill only processes the specified user owns.  Command names
         are optional.

I think any utility used to find processes on a system with Linux/Solaris style /proc (procfs) will use the full list of processes (doing some readdir of /proc ). I think they will iterate over the numeric subdirectories of /proc and check every process found for a match.

To get list of users, use getpwent (it will get one user per call).

The skill (procps & procps-ng) and killall (psmisc) tools both use the getpwnam library call to parse the argument of the -u option, so only a username will be parsed. pkill (procps & procps-ng) uses both atol and getpwnam to parse the -u / -U argument and allows both numeric and textual user specifiers.

Lars Wirzenius ,Aug 4, 2011 at 10:11

pkill is not obsolete. It may be unportable outside Linux, but the question was about Linux specifically. – Lars Wirzenius Aug 4 '11 at 10:11

Petesh ,Aug 4, 2011 at 10:58

to get the list of users use the one liner: getent passwd | awk -F: '{print $1}' – Petesh Aug 4 '11 at 10:58

user489152 ,Aug 4, 2011 at 12:07

what about I give a command like: "kill -ju UID" from C system() call? – user489152 Aug 4 '11 at 12:07

osgx ,Aug 4, 2011 at 15:01

is it an embedded linux? you have no skill, pkill and killall? Even busybox embedded shell has pkill and killall. – osgx Aug 4 '11 at 15:01

michalzuber ,Apr 23, 2015 at 7:47

killall -u USERNAME worked like charm – michalzuber Apr 23 '15 at 7:47

[Mar 05, 2012] Unknown Bash Tips and Tricks For Linux Linux.com

I've had some troubles with bleeding-edge releases of KMail; it hangs and doesn't want to close by normal means. It spawns a single process, which we can see with the ps command:

ps axf|grep kmail
 2489 ?     Sl  1:44 /usr/bin/kmail -caption KMail

You can start out gently and try this:

$ kill 2489

This sends the default SIGTERM (signal terminate) signal, which is similar to the SIGINT (signal interrupt) sent from the keyboard with Ctrl+c. So what if this doesn't work? Then you amp up your stopping power and use SIGKILL, like this:

$ kill -9 2489

This is the nuclear option and it will almost always work (a process blocked in an uninterruptible system call will not die until that call returns). As the relevant section of the GNU C manual says: "The SIGKILL signal is used to cause immediate program termination. It cannot be handled or ignored, and is therefore always fatal. It is also not possible to block this signal." This is different from SIGTERM and SIGINT and other signals that politely ask processes to terminate. They can be trapped and handled in different ways, and even blocked, so the response you get to a SIGTERM depends on how the program you're trying to kill has been programmed to handle signals. In an ideal world a program responds to SIGTERM by tidying up before exiting, like finishing disk writes and deleting temporary files. SIGKILL knocks it out and doesn't give it a chance to do any cleanup. (See man 7 signal for a complete description of all signals.)

So what's special about Bash kill over GNU /bin/kill? My favorite is how it looks when you invoke the online help summary:

$ help kill

Another advantage is it can use job control numbers in addition to PIDs. In this modern era of tabbed terminal emulators job control isn't the big deal it used to be, but the option is there if you want it. The biggest advantage is you can kill processes even if they have gone berserk and maxed out your system's process number limit, which would prevent you from launching /bin/kill. Yes, there is a limit, and you can see what it is by querying /proc:

$ cat /proc/sys/kernel/threads-max
61985
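To see which kill a given Bash session will actually run, the type builtin can be used. This is only a small sketch; the path reported for the external utility varies from system to system.

```bash
#!/usr/bin/env bash
# Show how Bash resolves the name "kill".
type -t kill     # prints "builtin" when the shell's own kill takes precedence
type -a kill     # lists the builtin first, then any kill executable in PATH
```

Running the external program explicitly by its full path bypasses the builtin, at the cost of the extra process discussed above.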

With Bash kill there are several ways to specify which signal you want to use. These are all the same:

$ kill 2489
$ kill -s TERM 2489
$ kill -s SIGTERM 2489
$ kill -n 15 2489

kill -l lists all supported signals.

If you spend a little quality time with man bash and the GNU Bash Manual I daresay you will learn more valuable tasks that Bash can do for you.

kill (O'Reilly)

You can use the built-in shell command kill to send a signal to any process you created, not just the currently running job. kill takes as argument the process ID, job number, or command name of the process to which you want to send the signal. By default, kill sends the TERM ("terminate") signal, which usually has the same effect as the INT signal that you send with [CTRL-C]. But you can specify a different signal by using the signal name (or number) as an option, preceded by a dash.

kill is so-named because of the nature of the default TERM signal, but there is another reason, which has to do with the way UNIX handles signals in general. The full details are too complex to go into here, but the following explanation should suffice.

Most signals cause a process that receives them to roll over and die; therefore if you send any one of these signals, you "kill" the process that receives it. However, programs can be set up to "trap" specific signals and take some other action. For example, a text editor would do well to save the file being edited before terminating when it receives a signal such as INT, TERM, or QUIT. Determining what to do when various signals come in is part of the fun of UNIX systems programming.

Here is an example of kill. Say you have a fred process in the background, with process ID 480 and job number 1, that needs to be stopped. You would start with this command:

$ kill %1

If you were successful, you would see a message like this:

[1] + Terminated                fred &

If you don't see this, then the TERM signal failed to terminate the job. The next step would be to try QUIT:

$ kill -QUIT %1

If that worked, you would see these messages:

fred[1]: 480 Quit(coredump)
[1] +  Done(131)                fred &

The 131 is the exit status returned by fred. [9]

[9] When a shell script is sent a signal, it exits with status 128+N, where N is the number of the signal it received (128 changes to 256 in future releases). In this case, fred is a shell script, and QUIT happens to be signal number 3.

But if even QUIT doesn't work, the "last-ditch" method would be to use KILL:

$ kill -KILL %1

(Notice how this has the flavor of "yelling" at the runaway process.) This produces the message:

[1] + Killed                    fred &

It is impossible for a process to "trap" a KILL signal; the operating system should terminate the process immediately and unconditionally. If it doesn't, then either your process is in one of the "funny states" we'll see later in this chapter, or (far less likely) there's a bug in your version of UNIX.
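The 128+N convention from footnote [9] can be checked directly. The sketch below reflects Bash behavior (other shells may report a different base, as the footnote hints) and uses an arbitrary sleep as the victim.

```bash
#!/usr/bin/env bash
# Sketch: a child that dies from signal N is reported by Bash as 128+N.
sleep 30 &
pid=$!
kill -TERM "$pid"
wait "$pid"
status=$?
echo "exit status: $status"                 # 128 + 15 for TERM
echo "signal name: $(kill -l "$status")"    # kill -l maps the status back to a name
```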

Here's another example.

Task 8.1

Write a script called killalljobs that kills all background jobs.

The solution to this task is simple, relying on jobs -p:

kill "$@" $(jobs -p)

You may be tempted to use the KILL signal immediately, instead of trying TERM (the default) and QUIT first. Don't do this. TERM and QUIT are designed to give a process the chance to "clean up" before exiting, whereas KILL will stop the process, wherever it may be in its computation. Use KILL only as a last resort!
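The advice above can be sketched as a small Bash function that sends TERM first and falls back to KILL only after a grace period. The function name stop_gracefully and the five-second default are assumptions made for illustration; the sketch also relies on Bash reaping its own finished children promptly, so that kill -0 stops succeeding once the process has exited.

```bash
#!/usr/bin/env bash
# stop_gracefully PID [TIMEOUT]: send TERM, wait up to TIMEOUT seconds
# (default 5), and use KILL only if the process is still around.
# Returns 0 if TERM was enough, 1 if KILL had to be used.
stop_gracefully() {
  local pid=$1 timeout=${2:-5} i
  kill -TERM "$pid" 2>/dev/null || return 0    # already gone (or not ours)
  for ((i = 0; i < timeout; i++)); do
    kill -0 "$pid" 2>/dev/null || return 0     # exited after TERM
    sleep 1
  done
  kill -0 "$pid" 2>/dev/null || return 0       # exited during the last second
  kill -KILL "$pid" 2>/dev/null                # last resort
  return 1
}
```

A call such as stop_gracefully 2489 10 would give the process ten seconds to clean up before it is killed outright.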

You can use the kill command with any process you create, not just jobs in the background of your current shell. For example, if you use a windowing system, then you may have several terminal windows, each of which runs its own shell. If one shell is running a process that you want to stop, you can kill it from another window, but you can't refer to it with a job number because it's running under a different shell. You must instead use its process ID.

Learn Linux, 101: Create, monitor, and kill processes

Send signals to processes

Let's now look at Linux signals, which are an asynchronous way to communicate with processes. We have already mentioned the SIGHUP signal, and we have used both Ctrl-c and Ctrl-z, which are other ways of sending a signal to processes. The general way to send a signal is with the kill command.

Sending signals using kill

The kill command sends a signal to a specified job or process. Listing 22 shows the use of the SIGTSTP and SIGCONT signals to stop and resume a background job. Using the SIGTSTP signal is equivalent to using the fg command to bring the job to the foreground and then Ctrl-z to suspend it. Using SIGCONT is like using the bg command.

Listing 22. Stopping and restarting background jobs
                
ian@attic4:~$ kill -s SIGTSTP %1

[1]+  Stopped                 xclock -d -update 1
ian@attic4:~$ jobs -l
[1]+  3878 Stopped                 xclock -d -update 1
[2]   5485 Running                 nohup sh pmc.sh &
[3]-  5487 Running                 nohup bash pmc.sh &
ian@attic4:~$ kill -s SIGCONT 3878
ian@attic4:~$ jobs -l
[1]   3878 Running                 xclock -d -update 1 &
[2]-  5485 Running                 nohup sh pmc.sh &
[3]+  5487 Running                 nohup bash pmc.sh &

We used the job specification (%1) to stop the xclock process in this example, and then the process id (PID) to restart (continue) it. If you stopped job %2 and then used tail with the -f option to follow it, you would see that only one process is updating the nohup.out file.

There are a number of other possible signals that you can display on your system using kill -l. Some are used to report errors such as illegal operation codes, floating point exceptions, or attempts to access memory that a process does not have access to. Notice that signals have both a number, such as 20, and a name, such as SIGTSTP. You may use either the number prefixed by a - sign, or the -s option and the signal name. On my system I could have used kill -20 instead of kill -s SIGTSTP. You should always check the signal numbers on your system before assuming which number belongs to which signal.
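A quick way to do that check in Bash is to let kill -l translate single names and numbers, as in this sketch. The five names chosen here are just examples; the loop prints whatever numbers the local system assigns.

```bash
#!/usr/bin/env bash
# Print the number the local system assigns to a few common signal names.
for name in HUP INT QUIT KILL TERM; do
  printf '%-5s %s\n' "$name" "$(kill -l "$name")"
done

# The reverse lookup works too: number in, name out.
kill -l 15
```

Using names such as kill -s TSTP in scripts avoids depending on the numbers at all.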

Signal handlers and process termination

You have seen that Ctrl-c terminates a process. In fact, it sends a SIGINT (or interrupt) signal to the process. If you use kill without any signal name, it sends a SIGTERM signal. For most purposes, these two signals are equivalent.

You have seen that the nohup command makes a process immune to the SIGHUP signal. In general, a process can implement a signal handler to catch signals. So a process could implement a signal handler to catch either SIGINT or SIGTERM. Since the signal handler knows what signal was sent, it may choose to ignore SIGINT and only terminate when it receives SIGTERM, for example. Listing 23 shows how to send the SIGTERM signal to job %2. Notice that the process status shows as "Terminated" right after we send the signal. This would show as "Interrupt" if we used SIGINT instead. After a few moments, the process cleanup has occurred and the job no longer shows in the job list.

Listing 23. Terminating a process with SIGTERM
ian@attic4:~$ kill -s SIGTERM %2
ian@attic4:~$ 
[2]-  Terminated              nohup sh pmc.sh
ian@attic4:~$ jobs -l
[1]-  3878 Running                 xclock -d -update 1 &
[3]+  5487 Running                 nohup bash pmc.sh &

Signal handlers give a process great flexibility. A process can do its normal work and be interrupted by a signal for some special purpose. Besides allowing a process to catch termination requests and take possible action such as closing files or checkpointing transactions in progress, signals are often used to tell a daemon process to reread its configuration file and possibly restart operation. You might do this for the inetd process when you change network parameters, or the line printer daemon (lpd) when you add a new printer.

Terminating processes unconditionally

Some signals cannot be caught. SIGKILL, the most likely one you will use, cannot be caught by a signal handler and unconditionally terminates a process; SIGSTOP likewise cannot be caught. In general, you should need SIGKILL only if all other means of terminating the process have failed.

InformIT: Solaris 10 System Administration Exam Prep: Managing System Processes: Using Signals

Clearing frozen processes.

Solaris supports the concept of sending software signals to a process. These signals are ways for other processes to interact with a running process outside the context of the hardware. The kill command is used to send a signal to a process. System administrators most often use the signals SIGHUP, SIGKILL, SIGSTOP, and SIGTERM. The SIGHUP signal is used by some utilities as a way to notify the process to do something, such as re-read its configuration file. The SIGHUP signal is also sent to a process if the remote connection is lost or hangs up. The SIGKILL signal is used to abort a process, and the SIGSTOP signal is used to pause a process. The SIGTERM signal is the default signal sent to processes by commands such as kill and pkill when no signal is specified. Table 5.12 describes the most common signals an administrator is likely to use.

Exam Alert

Don't worry about remembering all of the signals listed; just be familiar with the more common signals, such as SIGHUP, SIGKILL, SIGSTOP, and SIGTERM.

[Mar 02, 2011] Removing zombie processes

Nice metaphor: zombie processes are just uncollected death certificates :-)
March 1, 2010 | DistroWatch.com

Dawn-of-the-data asks: What is a zombie process and how do I get rid of it?

DistroWatch answers: UNIX administrators have colourful names and descriptions for things, especially when it comes to processes. For example, when one program or process starts another process, the original is referred to as the "parent process" and the new process is called the "child process". When a child process has finished its task it "dies". The parent process is notified of its child's death and the child's information is removed from the system.

But sometimes a child process dies and the parent process doesn't stop to collect the information. When that happens, the child process itself is removed from the system, but a marker or "death certificate" is left behind, waiting to be collected. These uncollected death certificates are referred to as "zombie processes". These are rarely problems in themselves as they take up very little memory, but finding zombie processes usually means there's a bug in the parent program.

Let's say you've been monitoring your system and you've found a zombie process, what can you do about it? The first thing to do is find out which process is the parent of the zombie. You can do this by running the command

ps axo stat,ppid,pid,cmd | grep '^Z'

The output will show you all zombie processes on the system with the ID number of their parent in the second column. We can then remind the parent that they have zombie children running wild by sending them a signal. Let's say that the parent process ID is 12889, for example. We could remind this process to collect its child's death certificate by running

kill -SIGCHLD 12889

However, if the parent refuses to handle the signal and collect the child's data, then we have to choose between leaving a zombie in the system and killing the parent process. When a parent process dies, any children it has are turned over to the init service. The init service regularly checks the status of its children and collects any death certificates, removing zombies from the system. We can try killing the parent nicely by asking it to terminate using

kill 12889

where 12889 is the parent's process ID. But, if the parent is stubborn and refuses to go quietly, we can force the issue by running

kill -9 12889

At that point, the parent process will be removed from the system, its children (including any zombies) are given to init and the zombies will be removed.

How do I get rid of zombie processes that persevere

From: [email protected] (Casper Dik)
Date: Thu, 09 Sep 93 16:39:58 +0200

3.13) How do I get rid of zombie processes that persevere?

Unfortunately, it's impossible to generalize how the death of child processes should behave, because the exact mechanism varies over the various flavors of Unix.

First of all, by default, you have to do a wait() for child processes under ALL flavors of Unix. That is, there is no flavor of Unix that I know of that will automatically flush child processes that exit, even if you don't do anything to tell it to do so.

Second, under some SysV-derived systems, if you do "signal(SIGCHLD, SIG_IGN)" (well, actually, it may be SIGCLD instead of SIGCHLD, but most of the newer SysV systems have "#define SIGCHLD SIGCLD" in the header files), then child processes will be cleaned up automatically, with no further effort in your part. The best way to find out if it works at your site is to try it, although if you are trying to write portable code, it's a bad idea to rely on this in any case. Unfortunately, POSIX doesn't allow you to do this; the behavior of setting the SIGCHLD to SIG_IGN under POSIX is undefined, so you can't do it if your program is supposed to be POSIX-compliant.

So, what's the POSIX way? As mentioned earlier, you must install a signal handler and wait. Under POSIX signal handlers are installed with sigaction. Since you are not interested in ``stopped'' children, only in terminated children, add SA_NOCLDSTOP to sa_flags. Waiting without blocking is done with waitpid(). The first argument to waitpid should be -1 (wait for any pid), the third should be WNOHANG. This is the most portable way and is likely to become more portable in future.

If your system doesn't support POSIX, there are a number of ways. The easiest way is signal(SIGCHLD, SIG_IGN), if it works. If SIG_IGN cannot be used to force automatic clean-up, then you've got to write a signal handler to do it. It isn't easy at all to write a signal handler that does things right on all flavors of Unix, because of the following inconsistencies:

On some flavors of Unix, the SIGCHLD signal handler is called if one *or more* children have died. This means that if your signal handler only does one wait() call, then it won't clean up all of the children. Fortunately, I believe that all Unix flavors for which this is the case have available to the programmer the wait3() or waitpid() call, which allows the WNOHANG option to check whether or not there are any children waiting to be cleaned up. Therefore, on any system that has wait3()/waitpid(), your signal handler should call wait3()/waitpid() over and over again with the WNOHANG option until there are no children left to clean up. Waitpid() is the preferred interface, as it is in POSIX.

On SysV-derived systems, SIGCHLD signals are regenerated if there are child processes still waiting to be cleaned up after you exit the SIGCHLD signal handler. Therefore, it's safe on most SysV systems to assume when the signal handler gets called that you only have to clean up one child, and assume that the handler will get called again if there are more to clean up after it exits.

On older systems, there is no way to prevent signal handlers from being automatically reset to SIG_DFL when the signal handler gets called. On such systems, you have to put "signal(SIGCHLD, catcher_func)" (where "catcher_func" is the name of the handler function) as the last thing in the signal handler, so that it gets reinstalled.

Fortunately, newer implementations allow signal handlers to be installed without being reset to SIG_DFL when the handler function is called; handlers that stick can be installed with sigaction() or sigset(). For backward compatibility reasons, System V will keep the old semantics (reset handler on call) of signal(). To get around this problem on systems that do not have wait3()/waitpid() but do have SIGCLD, you need to reinstall the signal handler with a call to signal() after doing at least one wait() within the handler, each time it is called.

The summary of all this is that on systems that have waitpid() (POSIX) or wait3(), you should use that and your signal handler should loop, and on systems that don't, you should have one call to wait() per invocation of the signal handler.

One more thing -- if you don't want to go through all of this trouble, there is a portable way to avoid this problem, although it is somewhat less efficient. Your parent process should fork, and then wait right there and then for the child process to terminate. The child process then forks again, giving you a child and a grandchild. The child exits immediately (and hence the parent waiting for it notices its death and continues to work), and the grandchild does whatever the child was originally supposed to. Since its parent died, it is inherited by init, which will do whatever waiting is needed. This method is inefficient because it requires an extra fork, but is pretty much completely portable.

Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Last modified: February 19, 2020