kernel crash dump analysis using crash utility
amit mehta
gmate.amit at gmail.com
Mon Sep 19 02:56:26 EDT 2011
My Linux box just crashed while performing some network-related tests.
I'm trying to analyze the kernel crash dump using the "crash" utility
and need your help in analyzing it.
<<<Snip from crash output>>>
# crash /usr/lib/debug/lib/modules/2.6.32-131.0.15.el6.x86_64/vmlinux vmcore
.........versioning information here...............
.................etc etc..........................
KERNEL: /usr/lib/debug/lib/modules/2.6.32-131.0.15.el6.x86_64/vmlinux
DUMPFILE: vmcore [PARTIAL DUMP]
CPUS: 32
DATE: Mon Sep 19 09:44:45 2011 <----
UPTIME: 2 days, 22:55:44
LOAD AVERAGE: 0.00, 0.02, 0.00
TASKS: 409
NODENAME: RHEL61ga
RELEASE: 2.6.32-131.0.15.el6.x86_64
VERSION: #1 SMP Tue May 10 15:42:40 EDT 2011
MACHINE: x86_64 (2660 Mhz)
MEMORY: 12 GB
PANIC: "Oops: 0000 [#1] SMP " (check log for details)
PID: 0
COMMAND: "swapper"
TASK: ffffffff81a2d020 (1 of 8) [THREAD_INFO: ffffffff81a00000]
CPU: 0
STATE: TASK_RUNNING (ACTIVE)
WARNING: panic task not found
........................ stack trace frame 0 - frame 18 here .....................
--- <IRQ stack> ---
#19 [ffffffff81a01da8] ret_from_intr at ffffffff8100bad3
[exception RIP: acpi_check_resource_conflict+207] <----
RIP: ffffffff812bb86e RSP: ffffffff81a01e58 RFLAGS: 00000206
RAX: 0000000000000000 RBX: ffffffff81a01ec8 RCX: 0000000000000000
RDX: 0000000000000006 RSI: 0000000000000000 RDI: 00000000000018f6
RBP: ffffffff8100bace R8: 0000000000000000 R9: 0000000000000ca3
R10: ffff88019aa061c2 R11: ffff88019aa06201 R12: ffffffff81b7c0b8
R13: 0000000000000001 R14: ffffffff810ece03 R15: ffffffff81a01dd8
ORIG_RAX: ffffffffffffffb5 CS: 0010 SS: 0018
#20 [ffffffff81a01e50] acpi_check_resource_conflict at ffffffff812bb851
#21 [ffffffff81a01ed0] show_current_driver at ffffffff813eccb7
#22 [ffffffff81a01ef0] cpu_idle at ffffffff81009e96
<<<Snip from crash output>>>
Observation:
i) The panic hit at Sep 19 09:44:45 2011, and a peek inside /var/log/messages
reveals that the machine was down for about 4-5 minutes around this timestamp.
But there is no other relevant information in the log file. The panic string
printed above says:
"Oops: 0000 [#1] SMP " (check log for details)
so where else should I look for logs?
ii) Do frame 19 ([exception RIP: acpi_check_resource_conflict+207]) and the
register dump that immediately follows it (my machine is x86_64) indicate
where the Oops occurred?
iii) Disassembly of "acpi_check_resource_conflict" at offset 207 shows a 'test'
instruction being carried out on CPU 0:
0xffffffff812bb86e <acpi_check_resource_conflict+207>: test %ebx,0xc4c3e8(%rip) # 0xffffffff81f07c5c
iv) I believe that on 64-bit x86 machines, RIP represents EIP and
similarly RBX represents the EBX register.
Hence at the time of the panic, the contents of these registers were as follows:
RBX = ffffffff81a01ec8
RIP = ffffffff812bb86e
I don't know much assembly, I just did a little bit of searching on the web, and
my understanding is that the instruction "test %ebx,0xc4c3e8(%rip)" tells the
processor to do a bitwise "and" of the contents of the ebx register with the
value at the offset 0xc4c3e8 from the base address pointed to by rip,
i.e. ffffffff812bb86e+c4c3e8 = FFFFFFFF81F07C56 ? Please correct me if
I'm wrong.
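
The address arithmetic in (iv) can be checked with a short Python sketch.
On x86_64, a RIP-relative displacement is added to the address of the *next*
instruction, not to the address of the instruction being executed. The sketch
assumes this 'test' is encoded in 6 bytes (opcode 0x85, one ModRM byte, a 32-bit
displacement, no prefixes). The instruction bytes are not shown in the
disassembly above, so that length is an assumption. The register values,
displacement and symbol offset are copied from the crash output. The helper
names are made up for illustration.

```python
# Values copied from the crash output above.
RIP = 0xffffffff812bb86e   # address of the 'test' instruction
RBX = 0xffffffff81a01ec8   # RBX at the time of the exception
DISP = 0xc4c3e8            # displacement in 'test %ebx,0xc4c3e8(%rip)'
SYMBOL_OFFSET = 207        # from acpi_check_resource_conflict+207

# Assumption (not shown in the source): opcode + ModRM + disp32 = 6 bytes.
INSN_LEN = 6

MASK64 = (1 << 64) - 1


def rip_relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Effective address of a RIP-relative memory operand.

    The displacement is relative to the address of the next instruction,
    i.e. insn_addr + insn_len.
    """
    return (insn_addr + insn_len + disp) & MASK64


def low32(reg: int) -> int:
    """Low 32 bits of a 64-bit register value (EBX is the low half of RBX)."""
    return reg & 0xffffffff


if __name__ == "__main__":
    print(f"function start  : {RIP - SYMBOL_OFFSET:#x}")
    print(f"RIP + disp      : {(RIP + DISP) & MASK64:#x}")
    print(f"operand address : {rip_relative_target(RIP, INSN_LEN, DISP):#x}")
    print(f"EBX             : {low32(RBX):#x}")
```

Under that assumption the operand address comes out as 0xffffffff81f07c5c, the
same value the disassembler prints in its trailing comment, and 6 bytes past
the RIP + disp value of 0xffffffff81f07c56. Note also that 'test' only uses
EBX, the low 32 bits of RBX (0x81a01ec8), and sets flags without storing the
result of the "and".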
-Amit