PCI error handlers in Linux

Alvin Abitria abitria.alvin at gmail.com
Tue Sep 16 14:04:04 EDT 2014


Hello,

In my PCI driver for a certain PCI device, I implemented the PCI error
handler callbacks (error_detected, slot_reset, etc.).  I want to
trigger a PCI error so that I can exercise those handlers and observe
their behavior.  I've read in the kernel's PCI error recovery
documentation that the first step is the error_detected method, which
the system calls when it detects an error related to the PCI device.
The good thing is that the system detects the error on the driver's
behalf, which simplifies things.  But I'm having problems with the
error detection itself.
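
To make clear what sequence I am expecting, here is a rough userspace
sketch of the callback order as I understand it from the documentation.
It is not kernel code and not my driver: the enum and callback names
mirror include/linux/pci.h, but struct fake_dev, struct
fake_err_handlers, the demo_* handlers and simulate_recovery() are
hypothetical stand-ins, and the real recovery core does more than this
(for example, it resets the slot before calling slot_reset).

```c
/* Userspace stand-in, NOT kernel code. Names mirror include/linux/pci.h;
 * fake_dev, fake_err_handlers, demo_* and simulate_recovery() are
 * hypothetical and exist only for this illustration. */
#include <stddef.h>

typedef enum {
    pci_channel_io_normal = 1,
    pci_channel_io_frozen,
    pci_channel_io_perm_failure
} pci_channel_state_t;

typedef enum {
    PCI_ERS_RESULT_NONE = 1,
    PCI_ERS_RESULT_CAN_RECOVER,
    PCI_ERS_RESULT_NEED_RESET,
    PCI_ERS_RESULT_DISCONNECT,
    PCI_ERS_RESULT_RECOVERED
} pci_ers_result_t;

/* Records which callbacks ran, one letter each, in call order. */
struct fake_dev {
    char log[8];
    size_t n;
};

struct fake_err_handlers {
    pci_ers_result_t (*error_detected)(struct fake_dev *, pci_channel_state_t);
    pci_ers_result_t (*mmio_enabled)(struct fake_dev *);
    pci_ers_result_t (*slot_reset)(struct fake_dev *);
    void (*resume)(struct fake_dev *);
};

static void record(struct fake_dev *d, char c)
{
    if (d->n < sizeof d->log - 1)   /* keep the trailing NUL intact */
        d->log[d->n++] = c;
}

static pci_ers_result_t demo_error_detected(struct fake_dev *d,
                                            pci_channel_state_t state)
{
    record(d, 'D');
    if (state == pci_channel_io_perm_failure)
        return PCI_ERS_RESULT_DISCONNECT;   /* device is gone for good */
    if (state == pci_channel_io_frozen)
        return PCI_ERS_RESULT_NEED_RESET;   /* I/O blocked, ask for reset */
    return PCI_ERS_RESULT_CAN_RECOVER;
}

static pci_ers_result_t demo_mmio_enabled(struct fake_dev *d)
{
    record(d, 'M');
    return PCI_ERS_RESULT_RECOVERED;
}

static pci_ers_result_t demo_slot_reset(struct fake_dev *d)
{
    record(d, 'S');
    return PCI_ERS_RESULT_RECOVERED;
}

static void demo_resume(struct fake_dev *d)
{
    record(d, 'R');
}

const struct fake_err_handlers demo_handlers = {
    .error_detected = demo_error_detected,
    .mmio_enabled   = demo_mmio_enabled,
    .slot_reset     = demo_slot_reset,
    .resume         = demo_resume,
};

/* Simplified walk through the recovery steps for a single device. */
pci_ers_result_t simulate_recovery(const struct fake_err_handlers *h,
                                   struct fake_dev *d,
                                   pci_channel_state_t state)
{
    pci_ers_result_t r = h->error_detected(d, state);

    if (r == PCI_ERS_RESULT_CAN_RECOVER)
        r = h->mmio_enabled(d);
    if (r == PCI_ERS_RESULT_NEED_RESET)
        r = h->slot_reset(d);
    if (r == PCI_ERS_RESULT_RECOVERED)
        h->resume(d);
    return r;
}
```

With a zero-initialized struct fake_dev, calling
simulate_recovery(&demo_handlers, &dev, pci_channel_io_frozen) runs
error_detected, slot_reset and resume in that order.  That first call
into error_detected is the one I never see on my real device.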

I tried to trigger the error from the PCI device side.  In its
firmware, I triggered a reset of its PCI subsystem.  As a result, the
I/O rate dropped to zero and the driver can no longer send anything to
the device, so something did happen to the PCI connection.  However, my
error_detected method was never called, although I expected the kernel
to detect the PCI error and call the handler.  Instead, the following
warning appeared on the console:

irq 16: nobody cared
handlers:
...
...
Disabling IRQ # 16

What baffles me more is that the injected PCI error seems to have
brought down the device on IRQ 16 as well, and that is definitely not
the IRQ number of my driver/device.  Any thoughts on why the kernel did
not detect the PCI error?  Is there anything I might have missed when
registering the error handler methods?

If this way of injecting errors is not suitable, I'd like to ask for
other means of injecting PCI errors so that I can exercise my error
handlers.  Thanks!


