<br><br>On Saturday, October 19, 2024, Muni Sekhar <<a href="mailto:munisekharrms@gmail.com">munisekharrms@gmail.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Linux Kernel Developers,<br>

<br>

I am encountering a soft lockup issue in my system related to the<br>

continuous while loop in the empty_rx_fifo() function. Below is the<br>

relevant code:<br>

<br>

<br>

#include <linux/io.h> // For readw()<br>

<br>

#define FIFO_STATUS 0x0014<br>

#define FIFO_MAN_READ 0x0015<br>

#define RX_FIFO_EMPTY 0x01 // Assuming RX_FIFO_EMPTY is defined as 0x01<br>

<br>

static inline uint16_t read16_shifted(void __iomem *addr, u32 offset)<br>

{<br>

    void __iomem *target_addr = addr + (offset << 1); // Shift the offset left by 1 and add it to the base address<br>

    uint16_t value = readw(target_addr); // Read the 16-bit value from the calculated address<br>

    return value;<br>

}<br>

<br>

void empty_rx_fifo(void __iomem *addr)<br>

{<br>

    while (!(read16_shifted(addr, FIFO_STATUS) & RX_FIFO_EMPTY)) {<br>

        read16_shifted(addr, FIFO_MAN_READ); // Keep reading from the FIFO until it's empty<br>

    }<br>

}<br>

<br>

Explanation:<br>

Function Name: read16_shifted — The function reads a 16-bit value from<br>

an offset address with a left shift operation.<br>

Operation: It shifts the offset left by 1 (offset << 1), adds it to<br>

the base address, and reads the value from the new address.<br>

The empty_rx_fifo function is designed to clear out the RX FIFO, but<br>

I've encountered soft lockup issues. Specifically, the system logs<br>

repeated soft lockup messages in the kernel log, with a time gap of<br>

roughly 28 seconds between them (as per the kernel log timestamps).<br>

Here's an example log:<br>

<br>

watchdog: BUG: soft lockup - CPU#0 stuck for 23s!<br>

<br>

In all cases, the RIP points to:<br>

RIP: 0010:read16_shifted+0x11/0x20<br>

<br>

<br>

Analysis:<br>

The soft lockup seems to be caused by the continuous while loop in the<br>

empty_rx_fifo() function. The RX FIFO takes a considerable amount of<br>

time to empty, sometimes up to 1000 seconds. As a result, from the<br>

first occurrence of the soft lockup trace, the log repeats<br>

approximately every 28 seconds for the entire 1000 seconds duration.<br>

After 1000 seconds, the system resumes normal operation.<br>

<br>

Questions:<br>

1. How should I best handle this kind of issue? Even if the hardware<br>

takes time, I would like advice on the best approach to prevent these<br>

lockups.</blockquote><div><br></div><div>I would suggest either switching to an interrupt-driven model or running a dedicated thread that checks the RX-empty status and releases the CPU between checks.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
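</blockquote><div>A minimal sketch of the "check RX empty and release the CPU" idea, written as a userspace model so that it compiles on its own. The fake_fifo struct and the fake_read_* helpers are hypothetical stand-ins for the device and for read16_shifted(), and the budget parameter is an assumption, not something from the original driver. In a real driver the caller would sleep, call cond_resched(), or reschedule deferred work between calls instead of spinning:</div>

```c
#include <stdbool.h>
#include <stdint.h>

#define RX_FIFO_EMPTY 0x01

/* Hypothetical stand-in for the device: number of words still in the RX FIFO. */
struct fake_fifo {
    unsigned int words_left;
};

/* Models reading FIFO_STATUS: the empty bit is set once nothing is left. */
static uint16_t fake_read_status(const struct fake_fifo *f)
{
    return f->words_left == 0 ? RX_FIFO_EMPTY : 0;
}

/* Models reading FIFO_MAN_READ: each read pops one word from the FIFO. */
static uint16_t fake_read_data(struct fake_fifo *f)
{
    if (f->words_left > 0)
        f->words_left--;
    return 0;
}

/*
 * Drain at most `budget` words, then return so the caller can give up
 * the CPU. Returns true once the FIFO reports empty, false if the budget
 * ran out while data was still pending.
 */
static bool drain_rx_fifo_bounded(struct fake_fifo *f, unsigned int budget)
{
    while (budget--) {
        if (fake_read_status(f) & RX_FIFO_EMPTY)
            return true;
        (void)fake_read_data(f);
    }
    return (fake_read_status(f) & RX_FIFO_EMPTY) != 0;
}
```

<div>A polling thread or work item would call drain_rx_fifo_bounded() repeatedly until it returns true, giving up the CPU after every call that returns false, so no single stretch of busy reading is long enough to trip the soft lockup watchdog.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">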

2. Do soft lockup issues auto-recover like this? Is this something I<br>

should consider serious, or can it be ignored?</blockquote><div><br></div><div>The kernel is telling you that your CPU is stuck in that loop instead of doing something useful.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I would appreciate any guidance on how to resolve or mitigate this problem.<br>

<br>

<br>

-- <br>

Thanks,<br>

Sekhar<br>

<br>


</blockquote><br><br>-- <br>Regards / Mit besten Grüßen,<br>Denis<br><br>