How to debug stuck read?
Dāvis Mosāns
davispuh at gmail.com
Wed Feb 2 12:15:14 EST 2022
Hi,
I have a corrupted file on Btrfs which has CoW disabled and thus no
checksums. Trying to read this file causes the process to get stuck
forever; the read never returns EIO.
How can I find out why it gets stuck?
$ ddrescue -b 1 currupted_file /tmp/temp
GNU ddrescue 1.26
Press Ctrl-C to interrupt
ipos: 0 B, non-trimmed: 0 B, current rate: 0 B/s
opos: 0 B, non-scraped: 0 B, average rate: 0 B/s
non-tried: 8388 kB, bad-sector: 0 B, error rate: 0 B/s
rescued: 0 B, bad areas: 0, run time: 0s
pct rescued: 0.00%, read errors: 0, remaining time: n/a
time since last successful read: n/a
Copying non-tried blocks... Pass 1 (forwards)
^C
// doesn't stop with Ctrl+C or SIGTERM
$ gdb -q -p 3449
Attaching to process 3449
^C
// gdb gets stuck the same way
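One thing that may be worth checking at this point is the scheduler
state of the stuck task. A task in uninterruptible sleep (state D) does
not act on ordinary signals, which would be consistent with Ctrl+C,
SIGTERM and the gdb attach all hanging. A minimal sketch, assuming a
Linux /proc; the helper name task_state is made up for illustration:

```shell
# task_state is a hypothetical helper, not part of any existing tool.
# It prints the one-letter task state from a /proc/<pid>/status file:
# R (running), S (interruptible sleep), D (uninterruptible sleep), ...
task_state() {
    awk '/^State:/ { print $2; exit }' "$1"
}
```

It could be called as `task_state /proc/3449/status` for the stuck
ddrescue process.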
$ cat /proc/3449/stack | ./scripts/decode_stacktrace.sh vmlinux
folio_wait_bit_common (mm/filemap.c:1314)
filemap_get_pages (mm/filemap.c:2622)
filemap_read (mm/filemap.c:2676)
new_sync_read (fs/read_write.c:401 (discriminator 1))
vfs_read (fs/read_write.c:481)
ksys_read (fs/read_write.c:619)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
I enabled the following kernel config options:
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_LOCK_DEBUGGING_SUPPORT=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
but the only thing that shows up in dmesg is many repetitions of
BTRFS error (device sdh): invalid lzo header, lzo len 2937060802 compressed len 4096
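To make the number in that message easier to judge, it can help to look
at it in hex and compare it with the compressed length from the same
line. A minimal sketch using only the two values from the log; the
function and variable names are chosen here for illustration, and the
reading that the header length should not exceed the compressed length
is an assumption based on the wording of the message, not on the btrfs
source:

```shell
# lzo_len_hex is a hypothetical helper: print a decimal length in hex.
lzo_len_hex() {
    printf '0x%x\n' "$1"
}

lzo_len=2937060802      # "lzo len" from the dmesg line
compressed_len=4096     # "compressed len" from the dmesg line

lzo_len_hex "$lzo_len"
if [ "$lzo_len" -gt "$compressed_len" ]; then
    echo "header length exceeds compressed length"
fi
```

The header value is roughly 2.9 GB for a 4096-byte compressed extent,
which suggests the header bytes are not a valid length at all.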
If I try btrfs send, it gets stuck the same way:
$ btrfs send -v /mnt/fs > /dev/null
$ cat /proc/4712/stack | ./scripts/decode_stacktrace.sh vmlinux
folio_wait_bit_common (mm/filemap.c:1314)
__filemap_get_folio (mm/filemap.c:1690 ./include/linux/pagemap.h:779 mm/filemap.c:1960)
pagecache_get_page (mm/folio-compat.c:126)
send_extent_data (fs/btrfs/send.c:4980 fs/btrfs/send.c:5048 fs/btrfs/send.c:5235) btrfs
process_extent (fs/btrfs/send.c:5575 fs/btrfs/send.c:5959) btrfs
btrfs_ioctl_send (fs/btrfs/send.c:6770 fs/btrfs/send.c:7368 fs/btrfs/send.c:7688) btrfs
_btrfs_ioctl_send (fs/btrfs/ioctl.c:4963) btrfs
btrfs_ioctl (fs/btrfs/ioctl.c:5072) btrfs
__x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
I'm now using 5.17.0-rc2, but the behavior is exactly the same with 5.16.5.
Thanks!
Best regards,
Dāvis