What does inconsistent lock state mean?

Srivatsa Bhat bhat.srivatsa at gmail.com
Thu Dec 8 13:49:52 EST 2011


On Thu, Dec 8, 2011 at 11:50 PM, Kai Meyer <kai at gnukai.com> wrote:

> **
> On 12/08/2011 07:47 AM, Srivatsa Bhat wrote:
>
>  2 things:
> 1. Documentation/lockdep-design.txt explains the "cryptic lock state
> symbols".
> 2. Please post the lockdep splat _exactly_ as it appears, and _in full_
>     (and without line-wrapping, if possible). Usually lockdep is
> intelligent
>     enough to tell you the possible scenario that would lock up your
> system.
>     That gives a very good clue, in case you find it difficult to make out
> what
>      is wrong from the cryptic symbols.
>
>  Regards,
> Srivatsa S. Bhat
>
>
>
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies at kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>
>
> Oh, sorry. I suppose I only included things that made any sense to me. If
> I were to hazard a guess after reading through the design doc, it sounds
> like there's a problem with the context in which locks are being acquired?
> That seems odd to me, since I don't get the inconsistent lock state until
> I'm trying to spin_unlock &sblsnap_snapshot_table[i].sblsnap_lock (which
> is why I assume it's listed as one that's currently held.
>
>
> Dec  7 15:52:20 dev2 kernel: =================================
> Dec  7 15:52:20 dev2 kernel: [ INFO: inconsistent lock state ]
> Dec  7 15:52:20 dev2 kernel: 2.6.32-220.el6.x86_64.debug #1
> Dec  7 15:52:20 dev2 kernel: ---------------------------------
> Dec  7 15:52:20 dev2 kernel: inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W}
> usage.
> Dec  7 15:52:20 dev2 kernel: tee/1966 [HC0[0]:SC0[0]:HE1:SE1] takes:
> Dec  7 15:52:20 dev2 kernel: (&vblk->lock){?.-...}, at:
> [<ffffffffa04c4d2a>] sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: {IN-HARDIRQ-W} state was registered at:
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff810afbca>]
> __lock_acquire+0x77a/0x1570
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff810b0a64>] lock_acquire+0xa4/0x120
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff81520c75>]
> _spin_lock_irqsave+0x55/0xa0
> Dec  7 15:52:20 dev2 kernel:  [<ffffffffa006b19b>] blk_done+0x2b/0x110
> [virtio_blk]
> Dec  7 15:52:20 dev2 kernel:  [<ffffffffa00401dc>]
> vring_interrupt+0x3c/0xd0 [virtio_ring]
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff810ec080>]
> handle_IRQ_event+0x50/0x160
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff810ee840>]
> handle_edge_irq+0xe0/0x170
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff8100e059>] handle_irq+0x49/0xa0
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff81526cdc>] do_IRQ+0x6c/0xf0
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff8100ba93>] ret_from_intr+0x0/0x16
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff810148e2>] default_idle+0x52/0xc0
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff81009e0b>] cpu_idle+0xbb/0x110
> Dec  7 15:52:20 dev2 kernel:  [<ffffffff81516623>]
> start_secondary+0x211/0x254
> Dec  7 15:52:20 dev2 kernel: irq event stamp: 4699
> Dec  7 15:52:20 dev2 kernel: hardirqs last  enabled at (4699):
> [<ffffffff81179e81>] __kmalloc+0x241/0x330
> Dec  7 15:52:20 dev2 kernel: hardirqs last disabled at (4698):
> [<ffffffff81179d60>] __kmalloc+0x120/0x330
> Dec  7 15:52:20 dev2 kernel: softirqs last  enabled at (4696):
> [<ffffffff81076baa>] __do_softirq+0x14a/0x200
> Dec  7 15:52:20 dev2 kernel: softirqs last disabled at (4681):
> [<ffffffff8100c30c>] call_softirq+0x1c/0x30
> Dec  7 15:52:20 dev2 kernel:
> Dec  7 15:52:20 dev2 kernel: other info that might help us debug this:
> Dec  7 15:52:20 dev2 kernel: 1 lock held by tee/1966:
> Dec  7 15:52:20 dev2 kernel: #0:
> (&sblsnap_snapshot_table[i].sblsnap_lock){+.+.+.}, at: [<ffffffffa04c4b7c>]
> sblsnap_snap_now+0xac/0x2a0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel:
> Dec  7 15:52:20 dev2 kernel: stack backtrace:
> Dec  7 15:52:20 dev2 kernel: Pid: 1966, comm: tee Not tainted
> 2.6.32-220.el6.x86_64.debug #1
> Dec  7 15:52:20 dev2 kernel: Call Trace:
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810ad947>] ?
> print_usage_bug+0x177/0x180
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810ae8ed>] ? mark_lock+0x35d/0x430
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810afa59>] ?
> __lock_acquire+0x609/0x1570
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810ab31d>] ?
> trace_hardirqs_off+0xd/0x10
> Dec  7 15:52:20 dev2 kernel: [<ffffffff815208e7>] ?
> _spin_unlock_irqrestore+0x67/0x80
> Dec  7 15:52:20 dev2 kernel: [<ffffffff8106ec43>] ?
> release_console_sem+0x203/0x250
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810b0a64>] ? lock_acquire+0xa4/0x120
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ?
> sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffff81520af6>] ? _spin_lock+0x36/0x70
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ?
> sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4d2a>] ?
> sblsnap_snap_now+0x25a/0x2a0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4de4>] ?
> sblsnap_patch_blkdev+0x74/0x120 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4eaf>] ?
> sblsnap_get_snapshot+0x1f/0x60 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c4f59>] ?
> sblsnap_create_snapshot+0x69/0x120 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffffa04c53cb>] ?
> sblsnap_config_write+0x26b/0x2c0 [sblsnap]
> Dec  7 15:52:20 dev2 kernel: [<ffffffff81200343>] ?
> proc_file_write+0x73/0xb0
> Dec  7 15:52:20 dev2 kernel: [<ffffffff812002d0>] ?
> proc_file_write+0x0/0xb0
> Dec  7 15:52:20 dev2 kernel: [<ffffffff811f97c5>] ?
> proc_reg_write+0x85/0xc0
> Dec  7 15:52:20 dev2 kernel: [<ffffffff81193318>] ? vfs_write+0xb8/0x1a0
> Dec  7 15:52:20 dev2 kernel: [<ffffffff810e68b2>] ?
> audit_syscall_entry+0x272/0x2a0
> Dec  7 15:52:20 dev2 kernel: [<ffffffff81193d21>] ? sys_write+0x51/0x90
> Dec  7 15:52:20 dev2 kernel: [<ffffffff8100b0b2>] ?
> system_call_fastpath+0x16/0x1b
>
>
The line print_usage_bug_scenario(this); seems to be missing in your
kernel's print_usage_bug()
function in kernel/lockdep.c If it was there (it is there in newer
kernels), it would have showed
pictorially what is the problem, and then most likely you wouldn't have to
look at anything else
to figure out what's the problem! That's the reason I asked for full
output, but alas your lockdep
seems to be old.

Though, what exactly is the problem is not immediately apparent to me, I
would certainly
suggest using spin_lock_irqsave() and spin_unlock_irqrestore() functions,
instead of just using
the spin_lock() and spin_unlock() versions, whenever you are dealing with
locks that have the
possibility of being acquired by interrupt contexts (i.e., while servicing
interrupts).

Regards,
Srivatsa S. Bhat
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20111209/a65c505d/attachment-0001.html 


More information about the Kernelnewbies mailing list