side effects of calling interruptible_sleep_on_timeout()
Srivatsa S. Bhat
srivatsa.bhat at linux.vnet.ibm.com
Thu Apr 26 04:03:46 EDT 2012
On 04/26/2012 10:03 AM, Arun KS wrote:
> Hi Srivatsa,
>
> On Wed, Apr 25, 2012 at 3:56 PM, Srivatsa S. Bhat
> <srivatsa.bhat at linux.vnet.ibm.com> wrote:
>> On 04/25/2012 03:36 AM, Philipp Ittershagen wrote:
>>
>>> Hi Devendra,
>>>
>>> On Tue, Apr 24, 2012 at 03:24:23PM +0530, devendra rawat wrote:
>>>> Hi,
>>>> A switch driver is causing a soft lockup on a MontaVista Linux kernel
>>>> 2.6.10 system.
>>>> While browsing through the code of the driver, I came across a snippet
>>>> where, after disabling interrupts, a call is made to
>>>> interruptible_sleep_on_timeout(). The code snippet looks like this:
>>>> cli();
>>>> init_waitqueue_head(&queue);
>>>> interruptible_sleep_on_timeout(&queue, USEC_TO_JIFFIES(usec));
>>>> thread_check_signals();
>>>> sti();
>>>> I need to know the side effects of this sort of code. Can it be
>>>> responsible for the soft lockup of the system? It's a PowerPC-based
>>>> system.
>>>
>>> You cannot call sleep functions after disabling interrupts, because no
>>> interrupt will arrive for the scheduler to see the timeout and resume
>>> your task.
>>>
>>
>>
>> Yes, that's right. Also, in general, sleeping inside atomic sections (e.g.,
>> sections with interrupts disabled or preemption disabled) is wrong. There is
>> a config option in the kernel that you can use to enable checking for
>> sleeping inside atomic sections (CONFIG_DEBUG_ATOMIC_SLEEP, I believe),
>> which can help you pinpoint such bugs easily.
>
> I tried an experiment to check this.
>
> /* disable interrupts and preemption */
> spin_lock_irqsave(&lock, flags);
> /* enable preemption, but interrupt still disabled */
> spin_unlock(&lock);
> /* Now schedule something else */
> schedule_timeout(10 * HZ);
>
> But this is not causing any harm. I am able to call schedule() with
> interrupts disabled and the system works fine afterwards.
>
> So when I looked inside the schedule() function, it checks only
> whether preemption is disabled or not. schedule() calls BUG() only if
> preemption is disabled, not if interrupts are disabled.
>
> And AFAIK there is no function inside the kernel which tells you that
> interrupts are disabled.
>
> So the explanation of why the system works fine after calling schedule()
> with interrupts disabled goes as follows:
>
> There is a raw_spin_lock_irq(&rq->lock) inside __schedule(), which
> in turn calls local_irq_disable().
>
> local_irq_disable()/local_irq_enable() do not nest; there is no
> reference counting.
> One call to local_irq_enable() is enough to undo multiple calls to
> local_irq_disable().
>
> So my inference is that calling schedule() with interrupts disabled
> will not cause any problem, because the schedule function enables them
> again before we really schedule out.
> But a call to schedule() with preemption disabled will end up in the
> famous "BUG: scheduling while atomic".
>
Indeed, you are right! And your experiment and analysis are spot on too!
Sorry for the confusion - I had used the term "atomic" quite loosely. But
your careful experiment of re-enabling just preemption, while still keeping
interrupts disabled, was a very good one! And to add to what you said
above, __schedule() also does a preempt_enable() to re-enable preemption
(which it had disabled at the beginning). But since preempt_disable() can
nest, if we had called __schedule() with preemption already disabled, we
end up in trouble - and hence the BUG is fired in such cases.
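
To make that concrete, here is a rough sketch I put together (simplified,
not the actual code in kernel/sched/core.c) of the kind of debug check the
scheduler performs, and of why no equivalent check exists for interrupts:

#include <linux/kernel.h>
#include <linux/preempt.h>
#include <linux/sched.h>

/*
 * Simplified sketch (not the real kernel source) of the scheduler's
 * "scheduling while atomic" check.  preempt_disable()/preempt_enable()
 * maintain a nesting counter (preempt_count), so the scheduler can simply
 * test that counter on entry.  There is no such counter for
 * local_irq_disable()/local_irq_enable(); __schedule() just takes rq->lock
 * with raw_spin_lock_irq(), which re-enables interrupts unconditionally
 * when the lock is dropped again.
 */
static void sketch_schedule_debug(void)
{
	if (preempt_count())	/* someone still holds a preempt_disable() */
		printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
		       current->comm, current->pid, preempt_count());
}

So the only thing the scheduler can reliably detect is a non-zero
preempt_count; the interrupt-disable state is simply overwritten on the way
through __schedule().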
Thanks for the clarification!
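
One more note, coming back to the snippet in the original mail: instead of
cli() + interruptible_sleep_on_timeout(), the usual pattern is to sleep with
interrupts enabled on a wait queue and let the wakeup side (the interrupt
handler, for example) signal a condition. A minimal sketch, with the queue
and the 'event_happened' flag made up purely for illustration (on a 2.6.10
kernel the driver's own USEC_TO_JIFFIES() macro would take the place of
usecs_to_jiffies()):

#include <linux/wait.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

static DECLARE_WAIT_QUEUE_HEAD(queue);
static int event_happened;		/* set by the wakeup side */

/* Sleep with interrupts enabled; whoever sets event_happened
 * (an interrupt handler, for example) also calls wake_up(&queue). */
static long wait_for_event(unsigned long usec)
{
	return wait_event_interruptible_timeout(queue, event_happened,
						usecs_to_jiffies(usec));
}

The sleep_on() family is racy anyway (a wakeup can slip in between the check
and the sleep), so the wait_event*() helpers are preferred.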
Regards,
Srivatsa S. Bhat