x86 driver help: ever see DMA and MMIO operations trigger an NMI depending on which PCIe slot the device is installed in?

Jimmie Mayfield jimmie at sackheads.org
Sun May 17 02:27:53 EDT 2015


Hi all.  We're in the midst of performing system compatibility tests
with our device and I'm seeing some odd behavior when testing Lenovo
x3650 M4 and M5 machines.

1) On the x3650 M4 machine, an attempt to perform a TODEVICE DMA 
operation using consistent memory results in an NMI when the device 
attempts to fetch the memory.  Here's the thing:  this NMI only happens 
when the device is plugged into certain PCIe slots.  Some slots appear 
to work fine.

We obtained PCIe bus analyzer traces for the NMI scenario and sent them
to Lenovo for analysis.  Their response was a rather terse "memory
address is invalid".  We later obtained an analyzer trace for a non-NMI
scenario in a different slot and saw the very same bus address.  So I'm
very confused:  is it possible for a bus address allocated via
pci_alloc_consistent to be valid only for specific PCIe slots?
Shouldn't the kernel be able to allocate valid memory since it's given
the pci_dev * as an argument?
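
One narrow sanity check that can be applied to a bus address taken from
an analyzer trace is whether the whole buffer lies within the DMA mask
the driver set for the device.  The following is only a userspace sketch
of that arithmetic, not code from our driver: the helper names
dma_bit_mask() and bus_range_fits_mask() are hypothetical, and passing
this check says nothing about whether the platform accepts the address
from a given slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask with the low `bits` bits set (bits in 0..64). */
static uint64_t dma_bit_mask(unsigned bits)
{
    return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Hypothetical helper: true if the buffer [addr, addr + size) is
 * non-empty, does not wrap around, and its last byte is addressable
 * under `mask`.
 */
static bool bus_range_fits_mask(uint64_t addr, uint64_t size, uint64_t mask)
{
    uint64_t last;

    if (size == 0)
        return false;

    last = addr + (size - 1);
    if (last < addr)            /* address arithmetic wrapped */
        return false;

    return last <= mask;
}
```

For example, a 4 KiB buffer at bus address 0xFFFFF000 fits a 32-bit
mask, while the same buffer extended by one byte does not.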

2) On the x3650 M5 machine, an attempt to perform MMIO operations 
results in an NMI.  Again, only from certain slots.  Some slots seem to 
work fine.  Again, near as I can tell, the slots are physically the same 
-- same width, same power capabilities, etc.

Again, we obtained PCIe bus analyzer traces for both the NMI and the
non-NMI scenarios and compared them.  There's a lot of noise in the
traces because the machine BIOS appears to poll the PCI registers
repeatedly and frequently, but once the driver enables the device, we
don't see anything that stands out in one case or the other.  In both
traces, we see the device receive the MMIO read request and respond
about a microsecond later.  In the NMI trace, the NMI occurs after the
device writes the MMIO response.

So I'm scratching my head.  Has anyone seen such slot-specific behavior?
How does one account for this in a device driver?





More information about the Kernelnewbies mailing list