Possible Bug

Fri Apr 1 09:00:08 EDT 2016

On Thu, Mar 31, 2016 at 11:41 PM, nick <xerofoify at gmail.com> wrote:
>
>
> On 2016-03-31 04:22 PM, Roger H Newell wrote:
>> On Thu, Mar 31, 2016 at 4:53 PM,  <Valdis.Kletnieks at vt.edu> wrote:
>>> On Thu, 31 Mar 2016 15:46:51 -0230, Roger H Newell said:
>>>
>>>> I had a look inside the .config I used to compile this kernel.
>>>> I think I found the information you're looking for.
>>>>
>>>> # CONFIG_KASAN is not set
>>>> # CONFIG_SLAB is not set
>>>> CONFIG_SLUB=y
>>>> # CONFIG_SLOB is not set
>>>
>>> Well, that cuts down on the amount of code that needs to be stared at.
>>>
>>> I don't suppose we get extraordinarily lucky and the system was set up to
>>> do crash dumps, was it?
>>>
>>> I've spent a few more minutes looking at the relevant code, and the more I
>>> stare at it, the more I understand why we see the same stack trace in varied
>>> forums going back over a year - it looks like it only craps out if something
>>> during resume or hotplug or similar processing stomps on memory, and the next
>>> call to apparmor_file_alloc_security() has to allocate a new slab.
>>>
>>> Or more correctly, it only dies with *this* traceback under those conditions.
>>> If something else is next up to allocate a slab, it gets a different traceback.
>>>
>>>
>>
>> No it wasn't. There is a file
>> /var/crash/linux-image-4.5.0+.267545.crash. However, it's basically the
>> same output that I pasted from dmesg. I've included it anyway in case
>> there are some hints in it.
>>
>> ProblemType: KernelOops
>> Annotation: Your system might become unstable now and might need to be
>> restarted.
>> Date: Thu Mar 31 12:29:19 2016
>> Failure: oops
>> OopsText:
>>  [961778.803501] BUG: unable to handle kernel NULL pointer dereference
>> at 0000000000000805
>>  [961778.809728] IP: [<ffffffff811e636b>] kmem_cache_alloc_trace+0x7b/0x1e0
>>  [961778.815943] PGD cea04067 PUD abb59067 PMD 0
>>  [961778.822149] Oops: 0000 [#3] SMP
>>  [961778.828328] Modules linked in: binfmt_misc snd_hda_codec_realtek
>> snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_codec
>> snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event
>> snd_rawmidi snd_seq snd_seq_device snd_timer edac_mce_amd snd joydev
>> kvm_amd input_leds edac_core kvm soundcore serio_raw k10temp i2c_piix4
>> 8250_fintek asus_atk0110 mac_hid irqbypass parport_pc ppdev lp parport
>> autofs4 pata_acpi hid_generic usbhid hid amdkfd amd_iommu_v2 radeon
>> i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt
>> fb_sys_fops drm psmouse ahci pata_atiixp libahci r8169 mii wmi
>>  [961778.849223] CPU: 2 PID: 23118 Comm: sign-file Tainted: G      D
>>       4.5.0+ #28
>>  [961778.856339] Hardware name: System manufacturer System Product
>> Name/M5A78L-M LX PLUS, BIOS 0402    09/20/2011
>>  [961778.863557] task: ffff88003dbdc100 ti: ffff88009ae3c000 task.ti:
>> ffff88009ae3c000
>>  [961778.870811] RIP: 0010:[<ffffffff811e636b>]  [<ffffffff811e636b>]
>> kmem_cache_alloc_trace+0x7b/0x1e0
>>  [961778.878175] RSP: 0018:ffff88009ae3fc70  EFLAGS: 00010206
>>  [961778.885522] RAX: 0000000000000000 RBX: 00000000024080c0 RCX:
>> 000000000bd44541
>>  [961778.892949] RDX: 000000000bd44540 RSI: 00000000024080c0 RDI:
>> 0000000000019b20
>>  [961778.900361] RBP: ffff88009ae3fcb0 R08: ffff88012fc99b20 R09:
>> ffff88012b003cc0
>>  [961778.907810] R10: 0000000000000805 R11: fefefefefefefeff R12:
>> 00000000024080c0
>>  [961778.915294] R13: ffffffff813736d3 R14: 00007f9b2ac8c040 R15:
>> ffff88012b003cc0
>>  [961778.922812] FS:  00007f8546f0a700(0000) GS:ffff88012fc80000(0000)
>> knlGS:0000000000000000
>>  [961778.930405] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>  [961778.937994] CR2: 0000000000000805 CR3: 00000000b9cdc000 CR4:
>> 00000000000006e0
>>  [961778.945445] Stack:
>>  [961778.952673]  ffffffff81214fef ffff88009ae3fccc 0000000000000002
>> ffff880002c28700
>>  [961778.960013]  ffff880002c28700 ffff88009ae3fef4 00007f9b2ac8c040
>> ffff88009ae3fde0
>>  [961778.967372]  ffff88009ae3fcc8 ffffffff813736d3 ffffffff81c9fe80
>> ffff88009ae3fce8
>>  [961778.974682] Call Trace:
>>  [961778.981902]  [<ffffffff81214fef>] ? lookup_fast+0x16f/0x320
>>  [961778.989161]  [<ffffffff813736d3>] apparmor_file_alloc_security+0x23/0x40
>>  [961778.996452]  [<ffffffff81335b53>] security_file_alloc+0x33/0x50
>>  [961779.003495]  [<ffffffff8120bb6a>] get_empty_filp+0x9a/0x1c0
>>  [961779.010284]  [<ffffffff812176ce>] path_openat+0x2e/0x1400
>>  [961779.016817]  [<ffffffff8121661a>] ? walk_component+0x3a/0x470
>>  [961779.023241]  [<ffffffff811dd0ee>] ? alloc_pages_vma+0xbe/0x240
>>  [961779.029590]  [<ffffffff8121a38e>] do_filp_open+0x7e/0xe0
>>  [961779.035858]  [<ffffffff81196d36>] ?
>> lru_cache_add_active_or_unevictable+0x36/0xb0
>>  [961779.042118]  [<ffffffff811b9163>] ? handle_mm_fault+0x1253/0x19e0
>>  [961779.048323]  [<ffffffff811e629a>] ? kmem_cache_alloc+0x17a/0x1d0
>>  [961779.054493]  [<ffffffff81227606>] ? __alloc_fd+0x46/0x190
>>  [961779.060674]  [<ffffffff81208984>] do_sys_open+0x124/0x210
>>  [961779.066821]  [<ffffffff81208a8e>] SyS_open+0x1e/0x20
>>  [961779.072981]  [<ffffffff817ec736>] entry_SYSCALL_64_fastpath+0x1e/0xa8
>>  [961779.079150] Code: 08 65 4c 03 05 3f 3e e2 7e 49 83 78 10 00 4d 8b
>> 10 0f 84 14 01 00 00 4d 85 d2 0f 84 0b 01 00 00 49 63 41 20 48 8d 4a
>> 01 49 8b 39 <49> 8b 1c 02 4c 89 d0 65 48 0f c7 0f 0f 94 c0 84 c0 74 bb
>> 49 63
>>  [961779.085893] RIP  [<ffffffff811e636b>] kmem_cache_alloc_trace+0x7b/0x1e0
>>  [961779.092359]  RSP <ffff88009ae3fc70>
>>  [961779.098773] CR2: 0000000000000805
>>  [961779.105231] ---[ end trace e7adb7015192b3a5 ]---
>>
>> Package: linux-image-4.5.0+ (not installed)
>> SourcePackage: linux
>> Tags: kernel-oops
>> Uname: Linux 4.5.0+ x86_64
>>
> Roger,
> Are you able to reliably reproduce this error? If so, I would very much like
> to see the output with KASAN enabled to see the actual memory regions being
> freed/allocated before the NULL dereference.
> Nick
Nick:

No, I can't reliably reproduce the error. It doesn't happen every
time I plug in the USB stick, and it hasn't happened since the first
instance.

In terms of any further testing, you've had two shots at it already.
Frankly, unless someone else can confirm you are on the right track, I
don't feel it's a good use of my time to recompile and boot into my
kernel over and over. It was a fun ride at the time, but all rides
come to an end and this one ends here.

Best of luck,
Roger
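
For anyone picking this thread up later: the oops quoted above can be
sanity-checked against the memory-corruption theory without reproducing
it. What follows is only a minimal sketch in Python, assuming the
register values are copied by hand from the dump. The marked bytes in the
Code: line, "<49> 8b 1c 02", decode to mov (%r10,%rax,1),%rbx, so the
faulting address should equal R10 + RAX. The names OOPS_EXCERPT and
parse_registers are made up for illustration and are not part of any
kernel tool.

```python
import re

# Register lines copied by hand from the quoted oops (an excerpt, not the
# whole dump). Timestamps and quote markers are stripped.
OOPS_EXCERPT = """
RAX: 0000000000000000 RBX: 00000000024080c0 RCX: 000000000bd44541
RDX: 000000000bd44540 RSI: 00000000024080c0 RDI: 0000000000019b20
R10: 0000000000000805 R11: fefefefefefefeff R12: 00000000024080c0
CR2: 0000000000000805 CR3: 00000000b9cdc000 CR4: 00000000000006e0
"""


def parse_registers(text):
    """Return {name: int} for every 'NAME: <16 hex digits>' pair in text."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


regs = parse_registers(OOPS_EXCERPT)

# Faulting instruction: mov (%r10,%rax,1),%rbx  ->  address = R10 + RAX
fault_addr = regs["R10"] + regs["RAX"]

# CR2 holds the address that caused the page fault.
print(hex(fault_addr), fault_addr == regs["CR2"])
```

In the disassembly, R10 appears to be loaded from the per-CPU freelist
pointer just before the fault, so a value of 0x805 there is consistent
with something having overwritten that pointer. It says nothing about
what did the overwriting.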