Recovering Linux system from hung state via software

Mandeep Sandhu mandeepsandhu.chd at gmail.com
Wed Dec 4 03:13:38 EST 2013


> Assume one mother process is monitoring 10 child processes. Inside each
> child process, simply set up a PERIODIC (e.g., every 5 sec) mechanism to
> toggle a binary variable through some IPC means. The mother process
> resets each variable as it goes around checking their status; if a
> variable has not been toggled again since the last reset, that implies
> the particular process might be hung. The mother can wait further, or
> continue checking the other processes. At the end of checking ALL the
> processes, if everything is OK, it should feed the kernel watchdog
> timer. If the kernel watchdog timer is not reset, the kernel module will
> then reboot the system (i.e., the reboot comes from the kernel module).

Hold on! Why should we reboot the whole system if only some of these
processes are misbehaving? Why should the other processes suffer due to
this? Wouldn't it be better to just kill the erroneous process (as most
OSes do anyway, e.g. "Force Quit" in Ubuntu, or Chrome tabs)?

Or are these processes the only ones running on the system?
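
To make the kill-only-the-hung-process idea concrete, here is a rough
userspace sketch in Python. It is not the original poster's design, just
one way it could look, assuming Linux (fork), a shared heartbeat counter
per child instead of a toggled flag, and made-up timing values. The names
child, find_hung, supervise, BEAT and CHECK are all hypothetical.

```python
import multiprocessing as mp
import time

# Assumption: Linux, so the "fork" start method is available.
ctx = mp.get_context("fork")

BEAT = 0.02   # child heartbeat period in seconds (hypothetical value)
CHECK = 0.3   # parent check period in seconds (hypothetical value)


def child(counter, hang_after):
    """Bump a shared heartbeat counter; optionally simulate a hang."""
    beats = 0
    while True:
        if hang_after is not None and beats >= hang_after:
            time.sleep(3600)  # stand-in for a hung process
        counter.value += 1
        beats += 1
        time.sleep(BEAT)


def find_hung(counters, last_seen):
    """Return indices of children whose counter did not advance."""
    hung = []
    for i, counter in enumerate(counters):
        current = counter.value
        # The first check only records a baseline (last_seen is None).
        if last_seen[i] is not None and current == last_seen[i]:
            hung.append(i)
        last_seen[i] = current
    return hung


def supervise(hang_after_list, rounds=4):
    """Run the children, kill only the hung ones, return their indices."""
    # lock=False: each counter has a single writer, and a killed child
    # can then never leave a lock held.
    counters = [ctx.Value("L", 0, lock=False) for _ in hang_after_list]
    procs = [ctx.Process(target=child, args=(c, h), daemon=True)
             for c, h in zip(counters, hang_after_list)]
    for p in procs:
        p.start()
    last_seen = [None] * len(procs)
    killed = []
    for _ in range(rounds):
        time.sleep(CHECK)
        for i in find_hung(counters, last_seen):
            if procs[i].is_alive():
                procs[i].kill()   # SIGKILL for just this child
                procs[i].join()
                killed.append(i)
        # If nothing was hung, a real monitor could feed the kernel
        # watchdog here; that part is left out of this sketch.
    for p in procs:               # tidy up the healthy children
        if p.is_alive():
            p.kill()
            p.join()
    return killed
```

For example, supervise([None, 2, None]) starts three children, the second
of which stops beating after two heartbeats; it should return the index
of that child only, while the other two keep running until the monitor
itself finishes. A real monitor would presumably restart the killed child
and keep the reboot via the watchdog as a last resort.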

-mandeep


