Best way to debug an invalid opcode

Karaoui mohamed lamine moharaka at gmail.com
Thu Mar 19 05:22:25 EDT 2020


Hi list,

I am currently hitting a kernel oops that reports "invalid opcode:
0000 [#1] SMP".

I am working on this project: https://github.com/GiantVM/Linux-DSM

The full log of the bug can be found here:
https://github.com/GiantVM/Linux-DSM/pull/3 (at the end)

Here is a snippet of the log:

[  107.980285] ------------[ cut here ]------------
[  107.980995] kernel BUG at arch/x86/kvm/dsm-util.c:214!
[  107.981706] invalid opcode: 0000 [#1] SMP
[  107.982423] Modules linked in: ccm arc4 iwlmvm joydev mac80211
mei_wdt hid_alps snd_soc_skl snd_soc_skl_ipc snd_hda_codec_hdmi
snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_codec_realtek snd_hda_ext_core
snd_hda_codec_generic dell_wmi snd_soc_sst_match dell_smbios
snd_soc_core snd_compress snd_pcm_dmaengine ac97_bus dcdbas
dell_smm_hwmon iwlwifi snd_hda_intel snd_hda_codec cfg80211 intel_rapl
snd_hda_core x86_pkg_temp_thermal intel_powerclamp coretemp snd_hwdep
snd_pcm snd_timer snd input_leds serio_raw soundcore idma64 virt_dma
hci_uart uvcvideo mei_me btusb mei videobuf2_vmalloc btrtl
videobuf2_memops videobuf2_v4l2 videobuf2_core btbcm videodev btqca
btintel bluetooth media intel_pch_thermal intel_lpss_pci
processor_thermal_device intel_soc_dts_iosf acpi_als kfifo_buf
intel_hid int3400_thermal intel_lpss_acpi
[  107.985763]  acpi_pad intel_lpss dell_smo8800 int3403_thermal
industrialio acpi_thermal_rel mac_hid int340x_thermal_zone
sparse_keymap iscsi_tcp libiscsi_tcp autofs4 btrfs raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor
raid6_pq libcrc32c raid1 raid0 multipath linear i2c_hid i915
crct10dif_pclmul drm_kms_helper crc32_pclmul syscopyarea sysfillrect
sysimgblt ghash_clmulni_intel fb_sys_fops drm aesni_intel e1000e
aes_x86_64 lrw glue_helper ablk_helper cryptd ahci libahci wmi hid
pinctrl_sunrisepoint pinctrl_intel video fjes
[  107.990389] CPU: 1 PID: 1592 Comm: qemu-system-x86 Not tainted 4.9.76+ #5
[  107.991368] Hardware name: Dell Inc. Latitude 5280/08DMYJ, BIOS 1.9.3 03/08/2018
[  107.992346] task: ffff92fa7e1f9700 task.stack: ffffa2dc837e4000
[  107.993310] RIP: 0010:[<ffffffffb606794e>]  [<ffffffffb606794e>] dsm_decode_diff+0xbe/0xd0
[  107.994310] RSP: 0018:ffffa2dc837e7b50  EFLAGS: 00010286
[  107.995310] RAX: 00000000ffffffff RBX: ffff92fa66090000 RCX: 0000000000001000
[  107.996294] RDX: ffff92fa66096000 RSI: 0000000000000001 RDI: ffff92fa66090000
[  107.997297] RBP: ffffa2dc837e7b78 R08: ffffa2dc837e8000 R09: 0000000000001000
[  107.998292] R10: 0000000000001000 R11: 0000000000000030 R12: ffff92fa66096000
[  107.999305] R13: ffffa2dc85af31c0 R14: 0000000000078948 R15: 0000000000000001
[  108.000303] FS:  00007f92cfbff700(0000) GS:ffff92fab1480000(0000) knlGS:0000000000000000
[  108.001326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  108.002316] CR2: 00007f5c3c3ca688 CR3: 00000007fdc84000 CR4: 0000000000362670
[  108.003333] Stack:
[  108.004354]  00000007f9348548 ffffa2dc85b09000 0000000000078948 ffffa2dc85af31c0
[  108.005379]  ffff92fa7dc98000 ffffa2dc837e7bf8 ffffffffb6069713 0000000100000000
[  108.006427]  000000017e1fa940 ffff92fa66090000 0000000000000000 0000000000000000
[  108.007480] Call Trace:
[  108.008508]  [<ffffffffb6069713>] ivy_kvm_dsm_page_fault+0x573/0x840
[  108.009572]  [<ffffffffb60645e1>] kvm_dsm_page_fault+0x71/0xd0
[  108.010633]  [<ffffffffb606639c>] kvm_dsm_memcpy+0x33c/0x5c0
[  108.011667]  [<ffffffffb6066c76>] kvm_vm_ioctl_dsm+0xc6/0x360
[  108.012721]  [<ffffffffb6035ef9>] kvm_arch_vm_ioctl+0x929/0xbf0
[  108.013767]  [<ffffffffb611b605>] ? update_load_avg+0x75/0x390
[  108.014806]  [<ffffffffb611b605>] ? update_load_avg+0x75/0x390
[  108.015827]  [<ffffffffb60219ea>] kvm_vm_ioctl+0x8a/0x7c0
[  108.016828]  [<ffffffffb60957c5>] ? __switch_to+0x2e5/0x700
[  108.017842]  [<ffffffffb62a2412>] do_vfs_ioctl+0x92/0x5b0
[  108.018830]  [<ffffffffb68f9e12>] ? __sys_recvmsg+0x62/0x80
[  108.019823]  [<ffffffffb62a29a9>] SyS_ioctl+0x79/0x90
[  108.020806]  [<ffffffffb6a3159e>] entry_SYSCALL_64_fastpath+0x1e/0xc9
[  108.021801] Code: 0f 00 00 48 29 fb 48 29 de 81 c3 00 10 00 00 c1 eb 03 89 d9 f3 48 a5 4c 89 e7 e8 3e f8 1f 00 5b 41 5c 41 5d 41 5e 41 5f 5d f3 c3 <0f> 0b 0f 0b 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44
[  108.023980] RIP  [<ffffffffb606794e>] dsm_decode_diff+0xbe/0xd0
[  108.025003]  RSP <ffffa2dc837e7b50>
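
For reference, the two fields of this snippet that locate the failure are the "kernel BUG at file:line" message and the "symbol+offset/size" on the RIP line. A rough Python sketch to pull them out is below. The function name summarize_oops and the regular expressions are illustrative assumptions, not part of any kernel tooling, and they are written against the 4.9-style log format shown here.

```python
import re

# Three lines copied from the log excerpt above.
OOPS = """\
[  107.980995] kernel BUG at arch/x86/kvm/dsm-util.c:214!
[  107.981706] invalid opcode: 0000 [#1] SMP
[  108.023980] RIP  [<ffffffffb606794e>] dsm_decode_diff+0xbe/0xd0
"""

def summarize_oops(text):
    """Return (file, line, symbol, offset, size), or None if a field is missing.

    Hypothetical helper: it only understands the two line shapes used here.
    """
    bug = re.search(r"kernel BUG at (\S+):(\d+)!", text)
    rip = re.search(
        r"RIP\s+\[<[0-9a-f]+>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", text
    )
    if not bug or not rip:
        return None
    return (
        bug.group(1),            # source file named by the BUG message
        int(bug.group(2)),       # source line named by the BUG message
        rip.group(1),            # function containing the faulting RIP
        int(rip.group(2), 16),   # offset of RIP inside that function
        int(rip.group(3), 16),   # total size of that function in bytes
    )

if __name__ == "__main__":
    print(summarize_oops(OOPS))
```

With a vmlinux built with debug info, the symbol and offset can then be looked up in gdb, for example with "list *(dsm_decode_diff+0xbe)".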


AFAIK, this is a memory corruption problem. Maybe there is a function that
rewrites the kernel code and sets some instruction to 0x0?
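
One way to check this hypothesis from the log itself: the "Code:" line dumps the bytes around the faulting RIP and marks the byte at RIP with angle brackets. The sketch below (Python, with a made-up helper name faulting_bytes) extracts the bytes at that position. Here they are 0f 0b, which is the x86 encoding of ud2, not zeros. On x86, BUG() is implemented with ud2, which would be consistent with the "kernel BUG at arch/x86/kvm/dsm-util.c:214!" line, although this alone does not rule out corruption elsewhere. Note also that the "0000" after "invalid opcode:" is the trap error code, not the opcode value.

```python
# "Code:" line from the log above, joined back onto one line.
CODE_LINE = (
    "Code: 0f 00 00 48 29 fb 48 29 de 81 c3 00 10 00 00 c1 "
    "eb 03 89 d9 f3 48 a5 4c 89 e7 e8 3e f8 1f 00 5b 41 5c 41 5d 41 5e 41 "
    "5f 5d f3 c3 <0f> 0b 0f 0b 0f 1f 40 00 66 2e 0f 1f 84 00 00 00 00 00 0f "
    "1f 44"
)

UD2 = [0x0F, 0x0B]  # x86 encoding of the ud2 instruction

def faulting_bytes(code_line, count=2):
    """Return `count` bytes starting at the byte marked <xx> (the faulting RIP).

    Hypothetical helper for illustration; assumes exactly the format above.
    """
    tokens = code_line.split()[1:]  # drop the "Code:" label
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:start + count]]

if __name__ == "__main__":
    found = faulting_bytes(CODE_LINE)
    label = "ud2" if found == UD2 else "not ud2"
    print(" ".join(f"{b:02x}" for b in found), label)
```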

Ideas?