read the memory mapped address - pcie - kernel hangs

Greg KH greg at kroah.com
Thu Jan 9 06:37:08 EST 2020


On Thu, Jan 09, 2020 at 04:44:16PM +0530, Muni Sekhar wrote:
> On Thu, Jan 9, 2020 at 1:15 AM Greg KH <greg at kroah.com> wrote:
> >
> > On Thu, Jan 09, 2020 at 12:30:20AM +0530, Muni Sekhar wrote:
> > > Hi All,
> > >
> > > I have a module with a Xilinx FPGA. It implements UART(s), SPI(s), and
> > > parallel I/O, and interfaces them to the host CPU via the PCI Express bus.
> > > For certain tests, my system freezes without capturing a crash dump.
> > > I debugged this issue and tracked it down to the ‘readl()’ in the
> > > interrupt handler code.
> > >
> > > The ISR first reads the Interrupt Status register using ‘readl()’, as
> > > given below.
> > >     status = readl(ctrl->reg + INT_STATUS);
> > >
> > > It then clears the pending interrupts using ‘writel()’, as given below.
> > >     writel(status, ctrl->reg + INT_STATUS);
> > >
> > > I've noticed a kernel hang if the INT_STATUS register is read again
> > > after clearing the pending interrupts.
> >
> > Why would you read that register again after writing to it?
> >
> > And are you sure you are reading/writing the correct size of the irq
> > field?  I thought it was a "word" not "long"?  But that might depend on
> > your hardware, do you have a pointer to the kernel driver source you are
> > using for all of this?
> Actually, there is no need to read that register again. But reading that
> register again should not freeze the system, right?

It might, depends on your hardware.  Go talk to the hardware vendor if
you have questions about this.
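
For reference, the read-then-clear sequence quoted above is the usual
pattern for a write-1-to-clear status register. Below is a minimal
user-space sketch of that pattern, assuming write-1-to-clear semantics
(nothing in this thread confirms how this FPGA implements INT_STATUS).
`struct fake_ctrl`, `fake_readl()`, `fake_writel()` and `fake_isr()` are
hypothetical stand-ins for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device with one write-1-to-clear status register. */
struct fake_ctrl {
    uint32_t int_status; /* pending interrupt bits */
};

/* Stand-in for readl(ctrl->reg + INT_STATUS). */
static uint32_t fake_readl(const struct fake_ctrl *ctrl)
{
    assert(ctrl != NULL);
    return ctrl->int_status;
}

/*
 * Stand-in for writel(val, ctrl->reg + INT_STATUS).
 * Assumption: every bit set in val clears the matching pending bit.
 */
static void fake_writel(struct fake_ctrl *ctrl, uint32_t val)
{
    assert(ctrl != NULL);
    ctrl->int_status &= ~val;
}

/* Read the status once, acknowledge exactly the bits that were read. */
static uint32_t fake_isr(struct fake_ctrl *ctrl)
{
    uint32_t status = fake_readl(ctrl);

    if (status != 0)
        fake_writel(ctrl, status);

    return status; /* the caller dispatches on these bits */
}
```

In this model a second read after the clear simply returns whatever bits
are still pending. Whether a real device completes that second read at all
is up to the hardware, which this sketch cannot show.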

> The INT_STATUS register is 32 bits wide, so readl() is used (my system
> is x86_64, Intel(R) Atom(TM) CPU). Instead of readl(), do I need to
> use readw() twice? If so, what is the reason for this code change?

Ok, if that register is 32 bits, that's fine.  It all depends on your
hardware.

> I’m trying to understand why the system freezes without any crash dump
> while reading memory-mapped I/O from interrupt context.

Because your hardware locked things up?

> The FPGA code might be buggy; it may not send the completion for a Memory
> Read request. But the CPU should not get stuck at the LOAD instruction level.

PCI hardware can do lots of bad things to a system, it _IS_ part of the
memory bus, right?  So of course it can lock the CPU at a read.
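
One defensive check that drivers commonly add is sketched below in
user-space C. It assumes the platform hands back all-ones (0xFFFFFFFF) for
a read the device never answers, which is typical on x86 when a device has
dropped off the bus but is not guaranteed. It does nothing for the case
described in this thread, where the read never returns at all.
`struct sim_dev`, `sim_readl()` and `sim_irq_handler()` are hypothetical
names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_ALL_ONES 0xFFFFFFFFu

/* Hypothetical simulated device; "responding" models link/device health. */
struct sim_dev {
    bool     responding;
    uint32_t int_status;
};

enum sim_irq_result {
    SIM_IRQ_NOT_MINE,    /* no pending bits: not our interrupt */
    SIM_IRQ_DONE,        /* pending bits were acknowledged */
    SIM_IRQ_DEVICE_GONE  /* read looked like a failed bus transaction */
};

/* Stand-in for readl(): a dead device reads back as all-ones (assumption). */
static uint32_t sim_readl(const struct sim_dev *dev)
{
    assert(dev != NULL);
    return dev->responding ? dev->int_status : SIM_ALL_ONES;
}

static enum sim_irq_result sim_irq_handler(struct sim_dev *dev)
{
    uint32_t status = sim_readl(dev);

    /* Check for all-ones before trusting any individual status bit. */
    if (status == SIM_ALL_ONES)
        return SIM_IRQ_DEVICE_GONE;

    if (status == 0)
        return SIM_IRQ_NOT_MINE;

    /* Acknowledge what was read; models a write-1-to-clear register. */
    dev->int_status &= ~status;
    return SIM_IRQ_DONE;
}
```

The check only makes sense if all 32 status bits can never be legitimately
set at once; otherwise a valid status would be mistaken for a dead device.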

> When it hangs, it does not even respond to the SysRq key (SysRq is
> enabled, and it works in the normal scenario); the only way to recover is
> to reboot the system. I enabled almost all of the kernel.panic* variables.
> I set kernel.panic to a positive value, so it should reboot after a panic
> instead of just hanging. But it does not reboot by itself. Even
> ‘pstore/ramoops’ did not help.
> After reboot I looked at kern.log, and most of the time it has a
> “^@^@^@^ ...” line just before the reboot.
> 
> Okay, I will write minimal code to reproduce this and then share it
> with you.

What's wrong with the real/full driver source?

And again, why are you trying to read the register twice?

thanks,

greg k-h



More information about the Kernelnewbies mailing list