Seeking Assistance with Spin Lock Usage and Resolving Hard LOCKUP Error

Fri May 17 16:44:16 EDT 2024

On Thu, May 9, 2024 at 8:39 AM Muni Sekhar <munisekharrms at> wrote:
> Dear Linux Kernel Community,
> I am reaching out to seek assistance regarding the usage of spin locks
> in the Linux kernel and to address a recurring issue related to hard
> LOCKUP errors that I have encountered during testing.

Have you built your kernel with all the LOCKDEP options enabled?

> Recently, I developed a small kernel module that involves ISR handling
> and utilizes the spinlock_t primitive. In my module, I have employed
> spin locks both in process context using spin_lock() and spin_unlock()
> APIs, as well as in ISR context using spin_lock_irqsave() and
> spin_unlock_irqrestore() APIs.
> Here is a brief overview of how I have implemented spin locks in my module:

I certainly don't know whether the above and below are legal.
I'd compare my usage against working examples in the kernel source.

Also, you didn't say anything about your module or what it does.
(FWIW, you'd get more help if it were "our" module, i.e. GPL'd.)

> However, during testing, I have encountered a scenario where a hard
> LOCKUP (NMI watchdog: Watchdog detected hard LOCKUP on cpu 2) error
> occurs, specifically when a process context code execution triggers
> the spin_lock() function and is preempted by an interrupt that enters
> the ISR context and encounters the spin_lock_irqsave() function. This
> situation leads to the CPU being stuck indefinitely.

I'd build without the watchdog, to see what else goes wrong.
Two different errors might help find a common cause.
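
FWIW, as far as I understand the kernel's locking documentation, what you
describe matches the usual same-CPU self-deadlock: process context takes the
lock with spin_lock(), which leaves local interrupts enabled, the interrupt
fires on that CPU, and the ISR then spins on a lock whose holder cannot run
until the ISR returns. Below is a toy userspace model of one CPU, not kernel
code. Every name in it (toy_cpu, toy_isr, toy_deliver, toy_process_section)
is made up for illustration, and it says nothing about your actual module.

```c
/* Toy userspace model of ONE CPU. This is NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

struct toy_cpu {
    bool irqs_enabled;  /* models the local interrupt-enable flag */
    bool irq_pending;   /* a device has raised an interrupt */
    bool lock_held;     /* models the spinlock shared with the ISR */
    int  isr_done;      /* number of times the ISR finished its work */
};

/* Models the ISR. Returns false if it would spin forever on the lock. */
static bool toy_isr(struct toy_cpu *cpu)
{
    if (cpu->lock_held)
        return false;   /* on real hardware: the holder can never run again */
    cpu->lock_held = true;
    cpu->isr_done++;
    cpu->lock_held = false;
    return true;
}

/* Deliver a pending interrupt only if the CPU currently accepts interrupts. */
static bool toy_deliver(struct toy_cpu *cpu)
{
    if (!cpu->irq_pending || !cpu->irqs_enabled)
        return true;
    cpu->irq_pending = false;
    return toy_isr(cpu);
}

/*
 * Process-context critical section.
 * mask_irqs == false models spin_lock()/spin_unlock().
 * mask_irqs == true  models spin_lock_irqsave()/spin_unlock_irqrestore().
 * Returns true if the section completed, false if the model deadlocked.
 */
static bool toy_process_section(struct toy_cpu *cpu, bool mask_irqs)
{
    bool saved = cpu->irqs_enabled;
    bool ok;

    if (mask_irqs)
        cpu->irqs_enabled = false;
    assert(!cpu->lock_held);
    cpu->lock_held = true;

    cpu->irq_pending = true;    /* the device interrupts mid-section */
    ok = toy_deliver(cpu);      /* fires now only if interrupts are enabled */

    cpu->lock_held = false;
    cpu->irqs_enabled = saved;
    return ok && toy_deliver(cpu);  /* a deferred interrupt runs after unlock */
}
```

With mask_irqs false the model reports a deadlock; with mask_irqs true the
interrupt is held off until the lock is released and the ISR then runs
normally. In module terms that would mean using spin_lock_irqsave() and
spin_unlock_irqrestore() in process context for any lock the ISR also takes.
Inside the hard-IRQ handler itself plain spin_lock() is normally sufficient,
and the irqsave variant there is harmless.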

> My primary concern is to understand the appropriate usage of spin
> locks in both process and ISR contexts to avoid such hard LOCKUP
> errors. I am seeking clarification on the following points:


> 1. Is it safe to use spin_lock_irqsave() and spin_unlock_irqrestore()
>    APIs in ISR context and spin_lock() and spin_unlock() APIs in process
>    context simultaneously?
> 2. In scenarios where a process context code execution is preempted
>    by an interrupt and enters ISR context, how should spin locks be used
>    to prevent hard LOCKUP errors?
> 3. Are there any specific guidelines or best practices for using spin
>    locks in scenarios involving both process and ISR contexts?
> I would greatly appreciate any insights, guidance, or suggestions from
> the experienced members of the Linux kernel community to help address
> this issue and ensure the correct and efficient usage of spin locks in
> my kernel module.
> Thank you very much for your time and assistance.
