doubt on schedule_work() - work task getting scheduled late

Muni Sekhar munisekharrms at gmail.com
Fri Aug 5 03:22:51 EDT 2016


On Mon, Aug 1, 2016 at 6:34 PM, Daniel. <danielhilst at gmail.com> wrote:
> Did you try ftrace? https://www.kernel.org/doc/Documentation/trace/ftrace.txt
>
> I've been using this to measure some latencies. The problem here is
> visualizing the output. If you need something more elaborate than
> simple prints with timestamps and delta calculations, then you should
> try something more complex. If not, you can enable FTRACE and generate
> trace output with delta timestamps on it, even for interrupts :)

No, I haven't tried ftrace.


I need to measure the latency between schedule_work() and the actual
start of execution of my work function; the corresponding LTTng events
are workqueue_queue_work and workqueue_execute_start.

I installed LTTng on my test setup and simply enabled all available
kernel events.

I am able to start & stop tracing.
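
For the next run I plan to enable only the events I actually need
rather than everything; roughly like this (the session name is just a
placeholder):

lttng create wq-latency
lttng enable-event --kernel workqueue_queue_work,workqueue_activate_work,workqueue_execute_start,workqueue_execute_end
lttng enable-event --kernel irq_handler_entry,irq_handler_exit
lttng start
# ... reproduce the issue ...
lttng stop
lttng view
lttng destroy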


A few lines from the 'lttng view' trace:

[11:07:08.208795325] (+0.000001677) testbox irq_handler_entry: { cpu_id = 1 }, { irq = 16, name = "debug_irq" }
[11:07:08.208822703] (+0.000027378) testbox workqueue_queue_work: { cpu_id = 1 }, { work = 0xFFFF8801396D4F18, function = 0xFFFFFFFFC07273B0, req_cpu = 256 }
[11:07:08.208824380] (+0.000001677) testbox workqueue_activate_work: { cpu_id = 1 }, { work = 0xFFFF8801396D4F18 }
[11:07:08.208826615] (+0.000002235) testbox irq_handler_exit: { cpu_id = 1 }, { irq = 16, ret = 1 }
[11:07:08.208831364] (+0.000004749) testbox workqueue_execute_start: { cpu_id = 1 }, { work = 0xFFFF8801396D4F18, function = 0xFFFFFFFFC07273B0 }
[11:07:08.208841422] (+0.000010058) testbox workqueue_execute_end: { cpu_id = 1 }, { work = 0xFFFF8801396D4F18 }
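
To pull the queue-to-execute latency out of output like the above, I
plan to pair each workqueue_queue_work with the next
workqueue_execute_start on the same work pointer. A rough script (it
assumes the text layout shown above, with each event on one line; the
file name is arbitrary):

#!/usr/bin/env python3
# Pair each workqueue_queue_work event with the following
# workqueue_execute_start event that has the same 'work' pointer and
# print the queue-to-execute latency in milliseconds.
# Usage:  lttng view | python3 wq_latency.py
# Note: a trace that crosses midnight will wrap; good enough for a quick look.
import re
import sys

TS = re.compile(r'^\[(\d+):(\d+):(\d+)\.(\d+)\]')
WORK = re.compile(r'work = (0x[0-9A-Fa-f]+)')

queued = {}  # work pointer -> queue timestamp (seconds since midnight)

def ts_seconds(line):
    m = TS.match(line)
    if not m:
        return None
    h, mn, s, frac = m.groups()
    return int(h) * 3600 + int(mn) * 60 + int(s) + int(frac) / 10 ** len(frac)

for line in sys.stdin:
    t = ts_seconds(line)
    w = WORK.search(line)
    if t is None or w is None:
        continue
    ptr = w.group(1)
    if 'workqueue_queue_work' in line:
        queued[ptr] = t
    elif 'workqueue_execute_start' in line and ptr in queued:
        print('%s  %.3f ms' % (ptr, (t - queued.pop(ptr)) * 1000.0))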


Is it possible to print the function name instead of ‘function =
0xFFFFFFFFC07273B0’?
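
One workaround that should work is to look the address up in
/proc/kallsyms while the module is still loaded (this needs root, or
kptr_restrict set to 0, otherwise the listed addresses are all zero):

# the tracepoint records the function's entry address, so an exact match is expected
grep -i c07273b0 /proc/kallsyms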



Reproducing the 100 ms latency between workqueue_queue_work and
workqueue_execute_start requires running a long soak test, anything
over 24 hours.


If I record the LTTng trace for many hours, then when I 'lttng view'
the trace I get all sorts of messages like:


"[warning] Tracer discarded XXXX events between <timestamp1> and
<timestamp2>.  You  should consider recording a new trace with larger
buffers or with fewer events enabled."
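
Following that hint, one option I am considering is a dedicated channel
with larger sub-buffers carrying only the workqueue events (the sizes
below are guesses):

lttng enable-channel --kernel --subbuf-size=8M --num-subbuf=8 wq
lttng enable-event --kernel --channel=wq workqueue_queue_work,workqueue_execute_start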


Any better ideas for capturing the kernel trace event log when the
latency between workqueue_queue_work and workqueue_execute_start
reaches milliseconds?



>
> Best regards,
>
> 2016-08-01 7:32 GMT-03:00 Muni Sekhar <munisekharrms at gmail.com>:
>> On Fri, Jul 29, 2016 at 9:05 PM, Daniel. <danielhilst at gmail.com> wrote:
>>> Nice tool @Ricardo!
>>>
>>> 2016-07-29 10:48 GMT-03:00 Ricardo Ribalda Delgado <ricardo.ribalda at gmail.com>:
>>>> you can use http://lttng.org/ for analyzing this
>> Thanks Ricardo, I will use this.
>>
>>>>
>>>> Regards!
>>>>
>>>> On Fri, Jul 29, 2016 at 12:44 PM, Pranay Srivastava <pranjas at gmail.com> wrote:
>>>>> On Fri, Jul 29, 2016 at 4:02 PM, Muni Sekhar <munisekharrms at gmail.com> wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I have a doubt regarding the workqueue scheduling.
>>>>>>
>>>>>> I am using the workqueue for processing the Rx Interrupt data. I am
>>>>>> calling schedule_work() on receiving the Rx interrupt from hardware.
>>>>>>
>>>>>> I calculated the time between calling the schedule_work() and
>>>>>> workqueue task actually getting executed, Here I see many cases of
>>>>>> less than 100 us(It is fairly good).
>>>>>>
>>>>>> But sometimes I see 10s of ms and a lot in the 100s of ms. I have
>>>>>> seen over 0.5 sec too. I would like to know why the kernel sometimes
>>>>>> takes longer (milliseconds) to schedule it? Is there any way
>>>>>> to reduce this time gap?
>>>>>>
>>>>>>
>>>>>> My code:
>>>>>>
>>>>>> static void my_workqueuetask(struct work_struct *work)
>>>>>> {
>>>>>>     printk("In %s() \n",__func__);
>>>>>>
>>>>> You probably don't need this if it's just for your work_fn, yeah, but
>>>>> if there's a race between this and something else...
>>>>>>     mutex_lock(&bh_mutex);
>>>>>>
>>>>>>     //Do something here
>>>>>>
>>>>>>     mutex_unlock(&bh_mutex);
>>>>>> }
>>>>>>
>>>>>>
>>>>>> static irqreturn_t my_irq_handler(int irq, void *dev)
>>>>>> {
>>>>>>         ……;
>>>>>>
>>>>>>         if(Rx Interrupt)
>>>>>>              schedule_work(&rx_work);
>>>>>
>>>>> Maybe system_wq has too much already on its plate?
>>>>> Have you tried the same with a completion and a kthread? Or
>>>>> with your own workqueue? Overkill, but you can give it a shot.
>> I have not tried this. First I will analyze with lttng and then
>> attempt this. Does using our own workqueue reduce the latency?
>>
>>>>>>
>>>>>> return IRQ_HANDLED;
>>>>>> }
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Sekhar
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>         ---P.K.S
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ricardo Ribalda
>>>>
>>>
>>>
>>>
>>> --
>>> "Do or do not. There is no try"
>>>   Yoda Master
>>>
>>
>>
>>
>> --
>> Thanks,
>> Sekhar
>
>
>
> --
> "Do or do not. There is no try"
>   Yoda Master



-- 
Thanks,
Sekhar


