CPU STALL ISSUE

Amit Gupta amitkgupta09 at gmail.com
Thu Jul 23 02:34:38 EDT 2015


Hi All,
I am facing an issue with Linux kernel 4.0.4:


CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.0.4+ #19

I see this issue in the following two scenarios.

First Scenario:

The stall occurs when I use a PCIe-based Atheros WiFi card together with my
Ethernet controller. If I use either device on its own, I do not see the
issue.
(I do not think this is hardware related, because the same hardware
combination works without this issue on a different Linux kernel version
(3.4).)

Second Scenario:

The issue also reproduces if I repeatedly run the following command:

# hostapd -B /etc/hostapd.conf

My main focus is on the first scenario.

I searched for this issue and found that a few people faced it previously
on Linux kernels 3.1/3.2/3.3. Some say that disabling IPv6 support in the
kernel configuration resolves it, but I need IPv6 support in my kernel for
project-specific reasons.

By digging into the Linux kernel and examining the stack dump logged at the
time of the CPU stall, I found the suspect function
'ieee80211_wake_queues_by_reason'. I went through the series of functions
it calls and got stuck at one point: there are two 'rcu_read_lock();' calls
in sequence.



So my question is: can rcu_read_lock() be called twice in sequence
(nested), with rcu_read_unlock() then called once for each lock?

The call chain below illustrates my question:

ieee80211_wake_queues_by_reason()
  --> __ieee80211_wake_queue()
        if (skb_queue_empty(&local->pending[queue])) {
                rcu_read_lock();
                ieee80211_propagate_queue_wake(local, queue);
                rcu_read_unlock();
        }

ieee80211_propagate_queue_wake(local, queue)
  --> netif_wake_subqueue(sdata->dev, ac)
        --> rcu_read_lock();
            q = rcu_dereference(txq->qdisc);
            __netif_schedule(q);
            rcu_read_unlock();
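
To make the nesting pattern concrete, here is a minimal user-space sketch
in C. It is a model only, not kernel code: the model_* names are
hypothetical, and the single counter is loosely based on the per-task
nesting count that preemptible RCU keeps. As far as I can tell from the
kernel's RCU documentation, read-side critical sections are allowed to
nest as long as every rcu_read_lock() is matched by an rcu_read_unlock(),
which is the shape shown here.

```c
#include <assert.h>

/* User-space model only. All model_* names are hypothetical stand-ins;
 * the counter imitates a per-task read-side nesting depth. */
static int model_rcu_nesting;

static void model_rcu_read_lock(void)
{
    model_rcu_nesting++;            /* entering a read-side section */
}

static void model_rcu_read_unlock(void)
{
    assert(model_rcu_nesting > 0);  /* an unbalanced unlock is a bug */
    model_rcu_nesting--;            /* leaving a read-side section */
}

/* Inner section, shaped like the one in netif_wake_subqueue().
 * Returns the nesting depth observed inside it. */
static int model_inner(void)
{
    int depth;

    model_rcu_read_lock();
    depth = model_rcu_nesting;
    model_rcu_read_unlock();
    return depth;
}

/* Outer section, shaped like the one in __ieee80211_wake_queue().
 * Calls the inner section while already holding the read lock. */
static int model_outer(void)
{
    int inner_depth;

    model_rcu_read_lock();
    inner_depth = model_inner();
    model_rcu_read_unlock();
    return inner_depth;
}
```

In this model, calling model_outer() reaches a depth of 2 inside the inner
section and leaves the counter balanced at 0 afterwards, so the nesting by
itself does not leave the reader stuck in a critical section.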






My Approach:
I increased the RCU CPU stall timeout from 21 to 60. (This Kconfig option
is specified in seconds, not jiffies.)

-  CONFIG_RCU_CPU_STALL_TIMEOUT=21
+  CONFIG_RCU_CPU_STALL_TIMEOUT=60


But this did not solve my problem.
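
A rough sketch of why I think the larger timeout only postpones the
warning, rather than fixing anything: the Kconfig help text gives this
timeout in seconds, and the stall check compares elapsed jiffies against
it. The helper names below are hypothetical, HZ=100 is an assumption for
the example, and the real kernel check has extra details (clamping and a
small added delay) that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: convert a stall timeout given in seconds into
 * jiffies for a given tick rate (hz). */
static unsigned long stall_timeout_jiffies(unsigned int timeout_s,
                                           unsigned int hz)
{
    return (unsigned long)timeout_s * hz;
}

/* Hypothetical helper: has a grace period that started at gp_start
 * (in jiffies) been running long enough, by time now, to trip the
 * stall warning? Simplified model of the comparison only. */
static bool stall_would_warn(unsigned long gp_start, unsigned long now,
                             unsigned int timeout_s, unsigned int hz)
{
    return now - gp_start >= stall_timeout_jiffies(timeout_s, hz);
}
```

With HZ=100, a grace period stuck for 3000 jiffies (30 seconds) trips the
21-second timeout but not the 60-second one; once it has been stuck for
6000 jiffies, the 60-second timeout trips as well. If the CPU never leaves
its stalled state, the warning appears either way, just later.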



Please suggest any approach to solve this problem.


Thanks,
Amit Gupta