watchdog pet in kernel module

Peter Teoh htmldeveloper at
Wed Dec 4 20:50:17 EST 2013

On Thu, Dec 5, 2013 at 9:06 AM, Vipul Jain <vipulsj at> wrote:

> On Wed, Dec 4, 2013 at 4:57 PM, <Valdis.Kletnieks at> wrote:
>> On Wed, 04 Dec 2013 16:45:44 -0800, Vipul Jain said:
>> > If you don't mind can you please provide me more insight as what can be
>> > false alarm I can encounter to move pet inside kernel module?
>> The issue isn't false alarms - it's failure to alarm when it should.
>> The problem is that it's possible for a kernel to get wedged in such a
>> way that a kernel thread is still able to feed the watchdog timer on a
>> regular basis, but userspace is effectively hung and unable to proceed.
>> For example, if an OOPS happens while a filesystem lock is held, all
>> future userspace references to that filesystem (and possibly all
>> filesystems of the same type) will hang, eventually strangling the box
>> while the kernel is still perfectly able to keep the watchdog working.
> Hi Valdis,
> I see what you are saying, but what if the user process that's feeding the
> dog hangs while the rest of the system is fine? Then it will bring the whole
> system down, won't it? I basically want to avoid this.
Normally the process that feeds the dog is a simple process that JUST
periodically writes to the watchdog device descriptor.    Yes, one main() with
a while loop that does nothing but periodically reset the timer through that
descriptor.

And so if it is not able to respond in time, then by inference OTHER
PROCESSES must have hung.   In other systems I have seen a mother process
that monitors a few (not all) of its key child processes, so perhaps each
child has one variable to signal to the mother that it is running.   If a
child does not respond in time, the mother cleans up everything and then
purposely stops feeding the watchdog, resulting in a reboot.
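
One way that "one variable per child" scheme could look, as a rough sketch
only.  The struct layout and the names child_beat() and all_children_alive()
are invented here; in a real system the slots would have to live in memory
shared between the mother and its children, which is left out:

```c
#include <stddef.h>

/* One slot per monitored child (hypothetical layout). */
struct heartbeat {
    volatile unsigned long beats;  /* incremented by the child */
    unsigned long seen;            /* last value the mother observed */
};

/* Called by a child each time it completes a unit of work. */
void child_beat(struct heartbeat *hb)
{
    hb->beats++;
}

/* Called by the mother once per check interval.
 * Returns 1 only if every child has beaten since the previous check. */
int all_children_alive(struct heartbeat *hb, size_t n)
{
    int alive = 1;
    for (size_t i = 0; i < n; i++) {
        unsigned long now = hb[i].beats;
        if (now == hb[i].seen)
            alive = 0;             /* no progress: presume the child hung */
        hb[i].seen = now;          /* remember the value for the next check */
    }
    return alive;
}
```

The mother would write to the watchdog descriptor only while
all_children_alive() returns 1.  Once it returns 0, the mother cleans up and
simply stops writing, and the timer expiry does the reboot.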

> Regards,
> Vipul.

Peter Teoh

More information about the Kernelnewbies mailing list