<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Dec 5, 2013 at 10:19 AM, Rajat Sharma <span dir="ltr"><<a href="mailto:fs.rajat@gmail.com" target="_blank">fs.rajat@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Although /dev/watchdog is available in user mode, nothing should stop you from writing to it from a kernel thread.<span class="HOEnZb"><font color="#888888"><br>
<br></font></span></div><span class="HOEnZb"><font color="#888888">Rajat<br></font></span></div></blockquote><div><br></div><div style>I don't think /dev/watchdog (the literal path, I mean) is available inside the kernel. It is accessible from userspace, but within the kernel it goes by a different name. Moreover, if you access the underlying variable directly, bypassing the spinlocks implemented around it (see drivers/watchdog and look for the "wdt_lock" spinlock), you might run into a race condition.</div>
<div style><br></div><div style>But if you really insist on probing from inside the kernel, that is not a watchdog; it is a "process watch" of your own design.</div><div style><br></div><div style>That is, you can always write a loop that periodically probes the status of that specific process to make sure it is in the RUNNING state (as opposed to BLOCKED while it waits for some I/O or a lock to complete), and perhaps checks its current instruction to make sure it has not gone into a tight loop (e.g., a userspace program that literally does "while(true) {do_nothing()}"). There are many other possible "hung" criteria for a process as well. Not easy; in fact, extremely complex.</div>
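<div>To make the "probe the state periodically" idea concrete, here is a rough sketch. Note that it is a userspace analogue reading /proc/PID/stat, not the in-kernel probing discussed above, and it covers only one possible "hung" criterion. The helper names (parse_state, read_state, looks_stuck) and the rule "the last few samples are all D" are assumptions made up for illustration.</div>
<pre>
```python
def parse_state(stat_text):
    """Return the one-letter state field from the text of /proc/PID/stat."""
    # The second field (comm) is wrapped in parentheses and may itself
    # contain spaces or ")", so take whatever follows the last ")".
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def read_state(pid):
    """Read the current state letter of a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


def looks_stuck(samples, limit):
    """Hypothetical criterion: the last `limit` samples are all 'D'.

    'D' is uninterruptible sleep, typically waiting on I/O. A short spell
    in 'D' is normal, so only a long unbroken run is treated as suspect.
    """
    recent = samples[-limit:]
    return len(recent) == limit and all(s == "D" for s in recent)
```
</pre>
<div>A caller would append read_state(pid) to a list every few seconds and act when looks_stuck(...) turns true. A tight loop would show up as a steady "R", which this check deliberately does not flag, since a busy process looks the same.</div>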
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><span class="HOEnZb"><font color="#888888"></font></span></div><div class="gmail_extra"><br>
<br><div class="gmail_quote"><div><div class="h5">On Wed, Dec 4, 2013 at 5:50 PM, Peter Teoh <span dir="ltr"><<a href="mailto:htmldeveloper@gmail.com" target="_blank">htmldeveloper@gmail.com</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote"><div><div>
On Thu, Dec 5, 2013 at 9:06 AM, Vipul Jain <span dir="ltr"><<a href="mailto:vipulsj@gmail.com" target="_blank">vipulsj@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><div><div><br><br><div class="gmail_quote">On Wed, Dec 4, 2013 at 4:57 PM, <span dir="ltr"><<a href="mailto:Valdis.Kletnieks@vt.edu" target="_blank">Valdis.Kletnieks@vt.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div>On Wed, 04 Dec 2013 16:45:44 -0800, Vipul Jain said:<br>
<br>
> If you don't mind, could you please give me more insight into what false<br>
> alarms I could encounter if I move the pet inside a kernel module?<br>
<br>
</div>The issue isn't false alarms - it's failure to alarm when it should.<br>
<br>
The problem is that it's possible for a kernel to get wedged in such a way that<br>
a kernel thread is still able to feed the watchdog timer on a regular basis,<br>
but userspace is effectively hung and unable to proceed. For example, if an<br>
OOPS happens while a filesystem lock is held, all future userspace references<br>
to that filesystem (and possibly all filesystems of the same type) will hang,<br>
eventually strangling the box while the kernel is still perfectly able to keep<br>
the watchdog working.<br>
<br>
</blockquote></div></div></div>Hi Valdis,</div><div class="gmail_extra"><br></div><div class="gmail_extra">I see what you are saying, but what if the user process that's feeding the dog hangs while the rest of the system is fine? Then it will bring the whole system down, won't it? I basically want to avoid this.</div>
<div class="gmail_extra"><br></div></div></blockquote><div><br></div></div></div><div>Normally the process that feeds the dog is a simple process that JUST periodically writes to the watchdog device descriptor. Yes, one main() with a while loop that just periodically resets the timer through the descriptor.</div>
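<div>A minimal sketch of such a feeder loop, written in Python rather than C for brevity. The function name feed_watchdog and its parameters are made up for illustration, and the path is passed in so the loop can be tried against a harmless file such as /dev/null instead of a real /dev/watchdog. The final "V" write assumes the driver supports the "magic close" convention for disarming the timer on a clean exit.</div>
<pre>
```python
import os
import time


def feed_watchdog(path, interval_s, rounds):
    """Write to the watchdog node `rounds` times, sleeping in between.

    Returns the number of pets performed. `path` would be /dev/watchdog
    on a real system; here it is a parameter (an assumption of this
    sketch) so that the loop can be exercised safely.
    """
    fd = os.open(path, os.O_WRONLY)
    feeds = 0
    try:
        for _ in range(rounds):
            os.write(fd, b"\0")  # any write resets the timer
            feeds += 1
            time.sleep(interval_s)
        # Clean exit only: "V" right before close asks drivers that
        # support magic close to disarm the timer. If the loop dies
        # earlier, this line is skipped and the timer keeps running.
        os.write(fd, b"V")
    finally:
        os.close(fd)
    return feeds
```
</pre>
<div>For example, feed_watchdog("/dev/null", 1.0, 5) runs five harmless rounds. Pointed at a real watchdog node, opening the device arms the timer, so the box reboots if the loop stops before reaching the "V" write; that is the intended behaviour.</div>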
<div><br></div><div>And so if it is not able to respond in time, by inference, OTHER PROCESSES must have hung. In another system I saw, there is a mother process that monitors a few (not all) of its key child processes, so perhaps each child has one variable to signal to the mother that it is running. If a child does not respond in time, the mother cleans up everything and then purposely stops feeding the watchdog, resulting in a reboot.</div>
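<div>The mother-process scheme could be sketched roughly as below. This is only an illustration of the bookkeeping, not how that system was actually built: the Supervisor class, its method names, and the per-child timeout are all assumptions, and the clock is injectable so that the logic can be checked without waiting.</div>
<pre>
```python
import time


class Supervisor:
    """Tracks the last heartbeat of each key child process."""

    def __init__(self, children, timeout_s, now=time.monotonic):
        self._now = now
        self.timeout_s = timeout_s
        start = now()
        # Every child is treated as alive at start-up.
        self.last_seen = {name: start for name in children}

    def heartbeat(self, name):
        """Called when a child signals that it is still running."""
        self.last_seen[name] = self._now()

    def stale_children(self):
        """Names of children silent for longer than timeout_s."""
        t = self._now()
        return sorted(
            name for name, seen in self.last_seen.items()
            if t - seen > self.timeout_s
        )

    def should_feed(self):
        """Pet the watchdog only while every monitored child is fresh."""
        return not self.stale_children()
```
</pre>
<div>The mother's main loop would call should_feed() before each write to the watchdog device. Once it returns False, she cleans up and simply stops writing, and the hardware timer does the rest.</div>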
<div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"></div><div class="gmail_extra">Regards,</div><div class="gmail_extra">
Vipul.</div><div class="gmail_extra"><br></div></div><span><font color="#888888">
</font></span></blockquote></div><span><font color="#888888"><br><br clear="all"><div><br></div>-- <br>Regards,<br>Peter Teoh
</font></span></div></div>
<br></div></div><div class="im">_______________________________________________<br>
Kernelnewbies mailing list<br>
<a href="mailto:Kernelnewbies@kernelnewbies.org" target="_blank">Kernelnewbies@kernelnewbies.org</a><br>
<a href="http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies" target="_blank">http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies</a><br>
<br></div></blockquote></div><br></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Regards,<br>Peter Teoh
</div></div>