I2C bus driver TIMEDOUT because of PM autosuspend

Primoz Beltram primoz.beltram at kate.si
Tue Dec 3 04:26:49 EST 2019


On 3. 12. 19 02:30, anish singh wrote:
> On Fri, Nov 29, 2019 at 12:53 PM Primoz Beltram
> <primoz.beltram at gmail.com> wrote:
>> I am analysing a problem with an I2C bus driver where the problem shows
>> up as the I2C bus being completely blocked. The Linux driver in question
>> is /drivers/i2c/busses/i2c-xiic.c.
>> The problem is difficult to reproduce; it happens very rarely. So far I
>> have seen that the main precondition is very heavy I2C traffic on the bus.
>> In my case this is reproduced via netdev driving SFP LEDs via
>> /sys/class/leds/ (via gpio-pca953x). I generate traffic with iperf3.
>> Network traffic is on a 10Gbps EMAC. The Linux kernel is 4.14.0.
>> What I saw from debugging this problem is that the I2C bus gets blocked
>> when wait_event_timeout() completes because of a timeout. The timeout
>> handling in this driver is probably not robust enough (the bus should not
>> remain blocked), but at this moment this is just my speculation (I don't
>> know enough details).
> Check with a Saleae logic analyzer what happens on the I2C bus.
>
>> Looking at the driver code and at the data on an oscilloscope, I saw that
>> SCL within a single I2C data transfer sequence can be interrupted for
>> very long delays, e.g. up to hundreds of usec (SCL is 100kHz). I started
>> to suspect that the PM autosuspend delay could play some role here. There
>> are only two delays in the driver code, the first in wait_event_timeout()
>> and the second in setting the autosuspend delay. The case is a bit strange
>> because under very busy I2C traffic, PM autosuspend should not be
>> triggered at all. Additionally, if I lower the PM timeout, e.g. from 1000
>> (default) to 100, I hit the problem sooner (the wait until the problem
>> hits is on the order of n*10 minutes).
>>
>> It looks to me that PM autosuspend is playing some role here.
>>
>> Power management options in my .config:
>> # CONFIG_SUSPEND is not set
>> # CONFIG_PM is not set
>> CONFIG_ARCH_SUSPEND_POSSIBLE=y
>>
>> I intentionally did not put a detailed description of the embedded system
>> and test setup here (long list), because the main reason for this post is:
>>
>> The workaround that works for me/customer (at the moment) is to disable
>> PM autosuspend in the driver code, either by increasing the PM delay from
>> 1000 to 10000 or by disabling autosuspend (commenting out the call to
>> pm_runtime_put_autosuspend() in xiic_xfer()).
>>
>> But, I would like to expose/discuss this issue (maintainer of the code,
>> or others).
>> The reason/source of the problem can be much more complex and in some
>> other place.
>>
>> So my question is who should I contact, is this the M: in the
>> MAINTAINERS list, the MODULE_AUTHOR, ...?
> You can certainly add the author in the loop, but I am afraid
> you won't get any help as this would be specific to your board. So
> it is best to check with the SoC vendor who wrote your I2C
> bus driver. Or it could be an issue with your I2C client; in that
> case show them your Saleae logic analyzer logs to see
> if they can figure out what is wrong.

Thanks for the reply and suggestions.

My first suspicion was signal integrity on the PCB, but if I add some debug
prints in the i2c-xiic driver (e.g. build with the DEBUG define), the problem
is no longer reproducible (not a single timeout completion in
wait_event_timeout()).

Since extra prints only change timing in software, a signal integrity
problem does not look credible to me.

For my system I fixed the problem in the i2c-xiic driver (in the timeout
handling, so that the bus is not left blocked).
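
To illustrate what "not leaving the bus blocked" on timeout means, here is a
minimal userspace sketch in C. It is not the actual change made to i2c-xiic:
struct fake_i2c, fake_wait(), fake_reinit() and fake_xfer() are hypothetical
names, and the only thing borrowed from the kernel is the convention that
wait_event_timeout() returns 0 when the timeout elapses. The point is only
that the timeout path releases the bus before returning -ETIMEDOUT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a controller; not the i2c-xiic data structures. */
enum bus_state { BUS_IDLE, BUS_BUSY };

struct fake_i2c {
    enum bus_state state;
    unsigned int ticks_until_done; /* time the simulated hardware needs */
    unsigned int reinit_count;     /* how often the recovery path ran */
};

/*
 * Simulated wait. Returns the remaining ticks (> 0) if the transfer
 * finished in time and 0 on timeout, like wait_event_timeout().
 */
static unsigned int fake_wait(const struct fake_i2c *bus,
                              unsigned int timeout_ticks)
{
    if (bus->ticks_until_done < timeout_ticks)
        return timeout_ticks - bus->ticks_until_done;
    return 0;
}

/* Assumed recovery step: put the controller back into a known idle state. */
static void fake_reinit(struct fake_i2c *bus)
{
    bus->state = BUS_IDLE;
    bus->reinit_count++;
}

/* One transfer. The bus is idle again on every return path. */
static int fake_xfer(struct fake_i2c *bus, unsigned int timeout_ticks)
{
    assert(bus != NULL);

    if (bus->state == BUS_BUSY)
        return -EBUSY;

    bus->state = BUS_BUSY;

    if (fake_wait(bus, timeout_ticks) == 0) {
        /* Timeout: recover instead of leaving the bus marked busy. */
        fake_reinit(bus);
        return -ETIMEDOUT;
    }

    bus->state = BUS_IDLE;
    return 0;
}
```

Calling fake_xfer() with a timeout shorter than ticks_until_done returns
-ETIMEDOUT but leaves state at BUS_IDLE, so the next call is not rejected
with -EBUSY.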

I also found a contact and filed a report with the SoC vendor.

WBR Primoz
>> How to proceed.
>>
>> WBR Primoz




