kernel crash dump analysis using crash utility

amit mehta gmate.amit at gmail.com
Mon Sep 19 02:56:26 EDT 2011


My Linux box just crashed while running some network-related tests.
I'm trying to analyze the kernel crash dump with the "crash" utility
and need your help in analyzing it.

<<<Snip from crash output>>>
# crash /usr/lib/debug/lib/modules/2.6.32-131.0.15.el6.x86_64/vmlinux vmcore
.........versioning information here...............
.................etc etc..........................
      KERNEL: /usr/lib/debug/lib/modules/2.6.32-131.0.15.el6.x86_64/vmlinux
    DUMPFILE: vmcore  [PARTIAL DUMP]
        CPUS: 32
        DATE: Mon Sep 19 09:44:45 2011   <---- ---
      UPTIME: 2 days, 22:55:44
LOAD AVERAGE: 0.00, 0.02, 0.00
       TASKS: 409
    NODENAME: RHEL61ga
     RELEASE: 2.6.32-131.0.15.el6.x86_64
     VERSION: #1 SMP Tue May 10 15:42:40 EDT 2011
     MACHINE: x86_64  (2660 Mhz)
      MEMORY: 12 GB
       PANIC: "Oops: 0000 [#1] SMP " (check log for details)
         PID: 0
     COMMAND: "swapper"
        TASK: ffffffff81a2d020  (1 of 8)  [THREAD_INFO: ffffffff81a00000]
         CPU: 0
       STATE: TASK_RUNNING (ACTIVE)
     WARNING: panic task not found

........................ stack trace frame 0 - frame 18 here .....................

--- <IRQ stack> ---
#19 [ffffffff81a01da8] ret_from_intr at ffffffff8100bad3
    [exception RIP: acpi_check_resource_conflict+207] < ---- --- --------
    RIP: ffffffff812bb86e  RSP: ffffffff81a01e58  RFLAGS: 00000206
    RAX: 0000000000000000  RBX: ffffffff81a01ec8  RCX: 0000000000000000
    RDX: 0000000000000006  RSI: 0000000000000000  RDI: 00000000000018f6
    RBP: ffffffff8100bace   R8: 0000000000000000   R9: 0000000000000ca3
    R10: ffff88019aa061c2  R11: ffff88019aa06201  R12: ffffffff81b7c0b8
    R13: 0000000000000001  R14: ffffffff810ece03  R15: ffffffff81a01dd8
    ORIG_RAX: ffffffffffffffb5  CS: 0010  SS: 0018
#20 [ffffffff81a01e50] acpi_check_resource_conflict at ffffffff812bb851
#21 [ffffffff81a01ed0] show_current_driver at ffffffff813eccb7
#22 [ffffffff81a01ef0] cpu_idle at ffffffff81009e96

<<<Snip from crash output>>>

Observation:
i) The panic hit at Sep 19 09:44:45 2011, and a peek inside /var/log/messages
shows that the machine was down for about 4-5 minutes around this timestamp.
But there is no other relevant information in the log file. The panic string
printed above says:
"Oops: 0000 [#1] SMP " (check log for details)
so where else should I look for logs?

ii) Does frame 19, [exception RIP: acpi_check_resource_conflict+207], together
with the register dump that immediately follows it (my machine is x86_64),
indicate where the Oops occurred?

iii) Disassembly of "acpi_check_resource_conflict" at offset 207 shows a
'test' instruction being carried out on CPU 0:

0xffffffff812bb86e <acpi_check_resource_conflict+207>:  test   %ebx,0xc4c3e8(%rip)        # 0xffffffff81f07c5c

iv) I believe that on 64-bit x86 machines RIP is the 64-bit counterpart of
EIP, and similarly RBX is the 64-bit counterpart of EBX.
Hence, at the time of the panic, the contents of these registers were as follows:

RBX = ffffffff81a01ec8
RIP = ffffffff812bb86e

I don't know much assembly and only did a little searching on the web. My
understanding is that the instruction "test   %ebx,0xc4c3e8(%rip)" tells the
processor to do a bitwise AND of the contents of the ebx register with the value
at offset 0xc4c3e8 from the base address pointed to by rip,
i.e. ffffffff812bb86e+c4c3e8 = FFFFFFFF81F07C56? Please correct me if
I'm wrong.
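
The address arithmetic in iv) can be checked with a small sketch. On x86_64,
a RIP-relative displacement is added to the address of the next instruction,
not to the address of the instruction itself. The sketch below assumes this
'test' instruction is 6 bytes long (opcode, ModRM byte and a 32-bit
displacement). That length is an assumption and is not shown in the crash
output, although it does agree with the 0xffffffff81f07c5c that the
disassembler printed in its comment. The helper names are hypothetical and
exist only for illustration.

```python
# Sketch: check the register and address arithmetic from the post.
# RIP, RBX and DISP are copied from the crash output above;
# ASSUMED_INSN_LEN is an assumption (opcode + ModRM + disp32 = 6 bytes).

MASK64 = (1 << 64) - 1


def low32(reg64):
    """Return the low 32 bits of a 64-bit register (e.g. EBX from RBX)."""
    return reg64 & 0xFFFFFFFF


def rip_relative_target(insn_addr, insn_len, disp):
    """Effective address of a RIP-relative operand.

    The displacement is applied to the address of the NEXT instruction,
    i.e. insn_addr + insn_len, wrapped to 64 bits.
    """
    return (insn_addr + insn_len + disp) & MASK64


RIP = 0xffffffff812bb86e
RBX = 0xffffffff81a01ec8
DISP = 0xc4c3e8
ASSUMED_INSN_LEN = 6

# Adding the displacement to the instruction's own address, as in the post.
naive = (RIP + DISP) & MASK64
# Adding it to the address of the following instruction.
target = rip_relative_target(RIP, ASSUMED_INSN_LEN, DISP)

print(hex(low32(RBX)))  # 0x81a01ec8
print(hex(naive))       # 0xffffffff81f07c56
print(hex(target))      # 0xffffffff81f07c5c
```

Under that assumption, the 6-byte gap between FFFFFFFF81F07C56 and the
0xffffffff81f07c5c shown by the disassembler is just the length of the
instruction, and EBX is simply the low 32 bits of RBX.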

-Amit


