Tips for Kernel Module Debugging

Lucas Tanure tanure at linux.com
Sat Sep 12 15:41:39 EDT 2015


On Sat, Sep 12, 2015 at 4:27 PM, <Valdis.Kletnieks at vt.edu> wrote:

> On Sat, 12 Sep 2015 16:04:43 -0300, Lucas Tanure said:
>
> > I'm testing the linux-next tree and I got this stack:
> >
> > [    2.158054] Call Trace:
> > [    2.158058]  [<ffffffff812b9159>] dump_stack+0x4b/0x72
> > [    2.158061]  [<ffffffff81074e62>] warn_slowpath_common+0x82/0xc0
> > [    2.158063]  [<ffffffff81074faa>] warn_slowpath_null+0x1a/0x20
> > [    2.158066]  [<ffffffffa0572291>] drm_dev_alloc+0x251/0x320 [drm]
> > [    2.158070]  [<ffffffffa0574d0b>] drm_get_pci_dev+0x3b/0x1e0 [drm]
> > [    2.158081]  [<ffffffffa07062d4>] i915_pci_probe+0x34/0x50 [i915]
> >
> > What is the best way to debug this? Do I really need to add a print,
> > compile, and boot many times?
> > How would you guys debug this?
>
> Step 0:  Include the last few lines *before* the Call Trace - if indeed
> it was a Warning, they will give you the file and line number of the
> WARN_ON.
>
> [26636.029711] ------------[ cut here ]------------
> [26636.029724] WARNING: CPU: 3 PID: 19157 at ./arch/x86/include/asm/thread_info.h:239 sigsuspend+0xa4/0xb0()
>
> Bummer of a birthmark, Hal.  The one my laptop hit was a WARN_ON inside
> either a macro or static inline from that .h file. Fortunately, yours
> was inside a .c file and pointed in the right place (see below for how
> I know that...)
>
> That 'cut here' is where you should start the cut-n-paste, and include
> everything down to 'end trace'.
>
> Having said that, looking at drivers/gpu/drm/drm_drv.c:drm_dev_alloc() we
> find only one WARN_ON:
>
>         if (drm_core_check_feature(dev, DRIVER_MODESET)) {
>                 ret = drm_minor_alloc(dev, DRM_MINOR_CONTROL);
>                 if (ret)
>                         goto err_minors;
>
>                 WARN_ON(driver->suspend || driver->resume);
>         }
>
> As to *why* that one triggered, you'll have to ask an actual i915 expert.
>

Hi,

My full warning:

------------[ cut here ]------------
WARNING: CPU: 3 PID: 243 at drivers/gpu/drm/drm_drv.c:569 drm_dev_alloc+0x251/0x320 [drm]()
Modules linked in: i915(+) joydev input_leds mousedev intel_rapl iosf_mbi
x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm ttm
hid_generic drm_kms_helper crct10dif_pclmul snd_hda_intel crc32_pclmul
usbhid snd_hda_codec crc32c_intel drm hid ghash_clmulni_intel snd_hda_core
eeepc_wmi asus_wmi aesni_intel iTCO_wdt sparse_keymap snd_hwdep led_class
aes_x86_64 lrw snd_pcm iTCO_vendor_support rfkill mxm_wmi evdev gf128mul
intel_gtt e1000e glue_helper mac_hid snd_timer syscopyarea ablk_helper
cryptd sysfillrect psmouse snd sysimgblt pcspkr fb_sys_fops ptp mei_me
i2c_i801 i2c_algo_bit soundcore mei shpchp i2c_core pps_core lpc_ich
serio_raw wmi fan battery processor thermal video button sch_fq_codel
ip_tables x_tables ext4 crc16 mbcache jbd2 sd_mod atkbd libps2 ahci libahci
libata
 xhci_pci xhci_hcd ehci_pci ehci_hcd scsi_mod usbcore usb_common i8042 serio
CPU: 3 PID: 243 Comm: systemd-udevd Not tainted 4.2.0-next-20150912-ARCH #5
Hardware name: System manufacturer System Product Name/Maximus IV GENE-Z,
BIOS 3603 11/09/2012
 0000000000000000 000000005ca47666 ffff88060f70b9d0 ffffffff812b9159
 0000000000000000 ffff88060f70ba08 ffffffff81074e62 ffff880612d39000
 ffffffffa06c7100 ffff880612f66098 ffffffffa06c7100 ffffffffa0691760
Call Trace:
 [<ffffffff812b9159>] dump_stack+0x4b/0x72
 [<ffffffff81074e62>] warn_slowpath_common+0x82/0xc0
 [<ffffffff81074faa>] warn_slowpath_null+0x1a/0x20
 [<ffffffffa0422291>] drm_dev_alloc+0x251/0x320 [drm]
 [<ffffffffa0424d0b>] drm_get_pci_dev+0x3b/0x1e0 [drm]
 [<ffffffffa05dd2d4>] i915_pci_probe+0x34/0x50 [i915]
 [<ffffffff812fdec5>] local_pci_probe+0x45/0xa0
 [<ffffffff812fde10>] ? pci_match_device+0xe0/0x110
 [<ffffffff812ff053>] pci_device_probe+0x103/0x150
 [<ffffffff813d7942>] driver_probe_device+0x222/0x490
 [<ffffffff813d7c34>] __driver_attach+0x84/0x90
 [<ffffffff813d7bb0>] ? driver_probe_device+0x490/0x490
 [<ffffffff813d557c>] bus_for_each_dev+0x6c/0xc0
 [<ffffffff813d70fe>] driver_attach+0x1e/0x20
 [<ffffffff813d6c4b>] bus_add_driver+0x1eb/0x280
 [<ffffffff813d8540>] driver_register+0x60/0xe0
 [<ffffffff812fd73c>] __pci_register_driver+0x4c/0x50
 [<ffffffffa0424f90>] drm_pci_init+0xe0/0x110 [drm]
 [<ffffffffa06e6000>] ? 0xffffffffa06e6000
 [<ffffffffa06e60a4>] i915_init+0xa4/0xab [i915]
 [<ffffffff81002123>] do_one_initcall+0xb3/0x200
 [<ffffffff81199801>] ? __vunmap+0x91/0xe0
 [<ffffffff811589a0>] do_init_module+0x5f/0x1ef
 [<ffffffff810fa707>] load_module+0x2197/0x27e0
 [<ffffffff810f7550>] ? symbol_put_addr+0x50/0x50
 [<ffffffff81188695>] ? __pte_alloc_kernel+0xa5/0xf0
 [<ffffffff810fae9e>] SyS_init_module+0x14e/0x190
 [<ffffffff8157046e>] entry_SYSCALL_64_fastpath+0x12/0x71
---[ end trace d2652104b24a32ff ]---

I could see that the real problem is in drm_dev_alloc, because it's the
function just before warn_slowpath_null in the trace, and
warn_slowpath_null is the function that prints the warning.
So how can I debug this?

Thanks!

--
Lucas Tanure
+55 (19) 988176559