How to debug stuck read?

Sun Feb 6 20:06:02 EST 2022

On Mon, Feb 07, 2022 at 02:07:47AM +0200, Dāvis Mosāns wrote:
> > > I think there should be a way to see which locks (and by who/where)
> > > have been taken for a long time.
> >
> > Well ... we do, but the problem is that the page lock is a single bit.
> > We just don't have the space in struct page for a pointer to a stack
> > trace.  So the page lock isn't like a spinlock or a mutex where we can
> > use the LOCKDEP infrastructure to tell us this kind of thing.
> >
> > Also, in this case, we know exactly where the lock was taken and by whom
> > -- and it would be no more information than you had from the stack trace.
> 
> The issue here is that you have a stuck task that doesn't have any
> crash/stack trace. The process itself is waiting in
> folio_wait_bit_common but I need to find the other side of it.

Right, but what you're asking for won't help find the other side.
It's just an automated way to find the side you did find.

> > kmap() doesn't lock the page; it's already locked at this point.
> > But if the memcpy() does crash then you're right, the page will never
> > be unlocked because it was this thread's job to unlock it when the page
> > was uptodate.  The thread will be dead, so there's no way to find it.
> > Do we not dump the thread's stack on its death?
> 
> Yeah there was, but as I said it happens only once per boot. So you
> have one (potentially old) crash/stack trace but many stuck processes
> with no clear cause. E.g. you get a crash and a stuck process, and kill
> the process. Then days later you try reading that file again and it
> gets stuck, but there won't be a stack trace as it won't reach that
> memcpy anymore.

I can't think of a way to solve that.  We can't know whether a dying task
"was going to" unlock a page.  So we have a locked page in the page cache
that nobody will ever unlock.  We can't remove it, because we don't know
that the task died.  We can't start I/O on it again, because it looks like
I/O is already in progress.
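
As a rough illustration of why a single-bit lock cannot tell you who holds
it, here is a user-space C11 sketch.  It is not kernel code: model_page,
model_owned_lock and every other model_* name are hypothetical, and they only
mimic the idea of one lock bit in a flags word next to a lock that has room
for an owner field.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_PG_LOCKED 0x1UL

struct model_page {
    atomic_ulong flags;     /* bit 0 plays the role of the page lock */
};

/* Take the bit lock if it is free. Nothing about the caller is recorded. */
static bool model_trylock_page(struct model_page *p)
{
    unsigned long old = atomic_fetch_or(&p->flags, MODEL_PG_LOCKED);
    return (old & MODEL_PG_LOCKED) == 0;
}

/* Clear the lock bit. Any caller can do this; the bit has no owner. */
static void model_unlock_page(struct model_page *p)
{
    atomic_fetch_and(&p->flags, ~MODEL_PG_LOCKED);
}

/* Contrast: a lock with room to remember who took it. */
struct model_owned_lock {
    atomic_bool locked;
    const char *owner;      /* only meaningful while locked is true */
};

static bool model_owned_trylock(struct model_owned_lock *l, const char *who)
{
    bool expected = false;
    if (!atomic_compare_exchange_strong(&l->locked, &expected, true))
        return false;
    l->owner = who;         /* the extra word the bit lock has no room for */
    return true;
}

/*
 * First reader locks the page and "dies" without unlocking.
 * Returns whether a second reader can then take the lock.
 */
static bool model_second_reader_succeeds(void)
{
    struct model_page page;
    atomic_init(&page.flags, 0);
    (void)model_trylock_page(&page);    /* first reader, never unlocks */
    return model_trylock_page(&page);   /* second reader */
}
```

In this model the second reader fails to take the lock, and the flags word
holds nothing that points back at the first reader.  The owned lock can name
its holder, but only by spending an extra pointer per lock.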

I think the only answer is "Don't ignore stack dumps in dmesg".
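
If it helps to automate that, a small filter over the kernel log can flag
lines that look like part of a crash report.  The sketch below is only an
assumption about what is worth matching: the marker strings are common in
kernel crash output but the list is not exhaustive, and crash_markers,
line_looks_like_crash and count_crash_lines are made-up names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Substrings often seen in kernel crash reports. Illustrative, not exhaustive. */
static const char *const crash_markers[] = {
    "BUG:",
    "Oops",
    "Call Trace:",
    "general protection fault",
};

/* Return true if the log line contains any of the markers above. */
static bool line_looks_like_crash(const char *line)
{
    size_t n = sizeof(crash_markers) / sizeof(crash_markers[0]);
    for (size_t i = 0; i < n; i++) {
        if (strstr(line, crash_markers[i]) != NULL)
            return true;
    }
    return false;
}

/* Count matching lines in a stream, e.g. stdin fed from dmesg. */
static unsigned long count_crash_lines(FILE *in)
{
    char buf[1024];
    unsigned long hits = 0;
    while (fgets(buf, sizeof(buf), in) != NULL) {
        if (line_looks_like_crash(buf))
            hits++;
    }
    return hits;
}
```

Called from a main() that passes stdin to count_crash_lines(), this could be
fed the output of dmesg after each boot.  A nonzero count means there is a
dump worth reading before the stuck readers show up.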