Thread scheduling in 2.6 kernels

Sun Feb 27 20:23:54 EST 2011

Hi Mandeep,

What preemption level have you set for your kernel? Check that first,
and find out from the third party who provided the scheduler which
algorithm it uses and how it modifies the nice values.

If the thread scheduling policy is set to SCHED_OTHER, then the third
party scheduler is being used. If you set the scheduling policy to
SCHED_FIFO for both the decoder and rendering threads, and give the
rendering thread the higher priority, that should do it for you. The
decoder thread may also be sitting in a busy loop. Why not create a
notifier for the decoder thread, so that it wakes up only when data is
available?
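
A minimal sketch of the SCHED_FIFO suggestion, assuming the two threads
are reachable as plain pthreads and that the vendor kernel still honours
the POSIX real-time policies (not verified on an SD-patched 2.6.22). The
helper names are made up for illustration. The calls need root or
CAP_SYS_NICE; without it they return EPERM.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: choose two SCHED_FIFO priorities, with the
 * renderer one step above the decoder. Returns 0, or EINVAL if the
 * system reports no usable priority range. */
static int pick_fifo_priorities(int *decode_prio, int *render_prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi <= lo)
        return EINVAL;
    *decode_prio = lo;
    *render_prio = lo + 1;
    return 0;
}

/* Switch an existing thread to SCHED_FIFO at the given priority.
 * Returns 0 or an errno value (EPERM without root/CAP_SYS_NICE). */
static int make_fifo(pthread_t thread, int priority)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    return pthread_setschedparam(thread, SCHED_FIFO, &sp);
}

/* Put both threads under SCHED_FIFO, renderer above decoder. */
static int set_render_above_decode(pthread_t render, pthread_t decode)
{
    int dp, rp, rc;

    rc = pick_fifo_priorities(&dp, &rp);
    if (rc != 0)
        return rc;
    rc = make_fifo(decode, dp);
    if (rc != 0)
        return rc;
    return make_fifo(render, rp);
}
```

Call set_render_above_decode() once both threads exist. Keep in mind
that a SCHED_FIFO thread that never blocks starves everything below it,
so the decoder should sleep when it has no work.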

Also, you need to tune your threads' run time and policies based on the
bit rate of the data you are rendering. If both threads, rendering and
decoding, can each get their run within the interval the bit rate
allows, that creates a smooth picture. That's the catch.

Are you using multiple cores to do the job, or a single core?

--Sri.

On Thu, Feb 24, 2011 at 8:47 AM, Mandeep Sandhu
<mandeepsandhu.chd at gmail.com> wrote:
>> Quite long questions you have below...but I'll try to summarize and answer....
>
> I did try to be as concise as possible! :)
>
>>
>> Btw, your problem description is great....I believe it helps (at least
>> /me) to get a sense what you gonna do, what you've done and how it
>> really works. A nice example for every one of us....
>
> Thanks
>
>>
>>> We're working in an MIPS based embedded system, running a fairly old
>>
>> OK, I'll make a bold note here. I have only been in touch with 32-bit
>> x86, so what I am going to say might be completely wrong when it is
>> brought to the MIPS realm.
>
> No probs...even I'm no expert in MIPS (rather my first time with MIPS
> as well!:))
>
> The only thing that I found which _might_ be pertinent to our
> discussion was that the multi-threading option for MIPS  was disabled
> ("MIPS MT options (Disable multithreading support.)" ). Since this is
> a vendor provided config option I have not changed it. So no processor
> MT support for apps.
>
>>
>>> Linux 2.6.22 kernel (with vendor provided BSP). We write UI
>>
>> I remember vaguely that CFS (Completely Fair Scheduler) was improved
>> somewhere after version 2.6.22...I can't recall exactly what the
>> changes were...
>
> The vendor provided linux kernel has the "Staircase Deadline"
> scheduler patched into it...so no CFS here...
>
>>
>> In fact, the latest "200 lines famous patch" also affect how scheduling works...
>
> Yeah I read about it (though I couldn't grasp how the thing actually
> works)...I have the user-space variant of this soln running on my
> ubuntu box :)
>
>>
>> Why not shift the network I/O to the decoder thread? or IMHO,
>> better...another separate thread? So each other could
>> overlap...between CPU computation and I/O.
>
> We have tested running the app with just the decoding bit disabled in
> the decoder thread. The animation is pretty smooth...though that's also
> because there's not much to do w/o the images! :)
>
> QT handles n/w i/o pretty well, in a non-blocking, async
> manner...though I'm not sure if it is internally using separate
> threads for doing so...will have to find out.
>
>>
>> one is lowest, latter is highest? hmmmm if we put that back to pre CFS
>> era, that could mean a very different time slice assignment...or in
>> simpler words...kinda bad idea. I think if it's using nice values, it's
>> better if the difference is around 5, or 10 at most.
>
> The idea of assigning 2 extreme pri's was to ensure that the decode
> thread never interferes with the main thread while animation is
> going on. It's almost like the main thread needs "real-time" priority
> while it's doing animation...and goes back to normal priority when
> idle! :)
>
> I think SD sched uses nice values...I'm also not certain whether the
> QT wrappers are assigning "nice" values when one tries to set priority
> to a thread...will have to check and get back.
>
>>
>> wait, so the decoder just "eats" the content of the buffer without being
>> signaled first? in other words, it just works all the time?
>
> I'm not sure i follow your question here.
>
> The main thread _copies_ raw data rx'ed from the n/w and adds it to a
> "job queue" of the decoder thread...a fxn in the decoder thread simply
> checks if there are any jobs in the queue...if there is...it accesses
> the data (which was copied earlier when adding the job) and decodes
> the image...
>
> This is where we had the 2 types of implementations...i.e. in one...this
> job queue is checked continuously like:
>
> while(true) {
>    if (job-queue is NOT empty) {
>       // do decode
>    }
> }
>
> And in the second implementation:
>
> while(true) {
>    if (job-queue is NOT empty) {
>        // do decode
>    } else {
>        // wait for main thread to signal us when a new job is available
>    }
> }
>
> The "waiting" (in 2nd implementation) is done via thread
> synchronization primitives available in QT
> (http://doc.qt.nokia.com/4.6/qwaitcondition.html)
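
The same wait-until-signalled pattern, written against plain pthreads in
case it helps to reason about it outside Qt. This is only a sketch: the
struct and function names are hypothetical, the fixed capacity is
arbitrary, and jobs are opaque pointers rather than real image buffers.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 16   /* arbitrary capacity, for illustration only */

struct job_queue {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    void  *jobs[QUEUE_CAP];
    size_t head, count;
};

static void queue_init(struct job_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = q->count = 0;
}

/* Main thread: add a job and wake the decoder.
 * Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct job_queue *q, void *job)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->jobs[(q->head + q->count) % QUEUE_CAP] = job;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Decoder thread: sleep (no CPU used) until a job is available. */
static void *queue_pop(struct job_queue *q)
{
    void *job;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)   /* re-check: wakeups can be spurious */
        pthread_cond_wait(&q->not_empty, &q->lock);
    job = q->jobs[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return job;
}
```

The decoder loop then becomes "job = queue_pop(&q); decode(job);", and
the thread only competes for the CPU while it actually has a job.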
>
>>
>>
>> I think this is the problem and that's why I proposed isolating the
>> network I/O into a separate thread. It's like ping pong: main thread
>> pushes new data, decoder thread waits...it is then woken
>> up..decoding...main thread waits....
>>
>> Technically it is called priority inversion..if I understood your
>> situation correctly.
>
> Hmmm...n/w io doesn't seem to be affecting animation perf of the main
> thread (as pointed above)...it's just that when the decoder thread has
> a job to do..I need it to be preempted by the main thread so it can
> complete its animation w/o the other thread taking away precious CPU
> cycles...
>
> I'm going to try "renice"-ing the decoder thread to a higher value
> and see if it changes the behaviour in the 2nd implementation (where
> we don't busy-loop)...
>
>>
>> Fixed? I don't think so. CFS is kinda using "delta" i.e if current
>> task runs for x and other which is waiting is y, then for the next
>> round, others deserve some kind of weighted x-y.
>
> SD sched, i think, assigns a fixed quota of runtime (= timeslice?) and
> if the process uses up this quota...its priority is reduced to the
> next level....
>>
>>> - How can I find out if the kernel supports NPTL (kernel managed
>>> threads) or plain old linux threads (user-space managed threads)?
>>
>> I think this trick might work: Check /proc/<pid>/maps or use pmap.
>> NPTL ones usually maps libtls in its process address space
>
> pmap's not available! :(
>
> and i couldn't see libtls mapped in this process's addr space (is it
> really libtls? why would we have TLS library for NPTL?...isn't libtls
> used for SSL communications?)
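
On the NPTL question: with glibc, the thread library reports its own
name through confstr(), which is also what the shell command
"getconf GNU_LIBPTHREAD_VERSION" prints. Note that this identifies the
C library's threading implementation rather than the kernel by itself.
A minimal sketch, assuming the toolchain's C library is glibc or
compatible (whether the vendor's libc answers this query is unknown);
the function names are made up.

```c
#include <string.h>
#include <unistd.h>

/* Copy the thread library's version string into buf (glibc returns
 * something like "NPTL <version>" or "linuxthreads-<version>").
 * Returns the size confstr() needs including the trailing NUL, or 0
 * if the C library does not support this query. */
static size_t thread_lib_version(char *buf, size_t len)
{
    return confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
}

/* 1 if the string starts with "NPTL", 0 if it says something else,
 * -1 if the answer is unavailable or did not fit. */
static int uses_nptl(void)
{
    char buf[64];
    size_t n = thread_lib_version(buf, sizeof(buf));

    if (n == 0 || n > sizeof(buf))
        return -1;
    return strncmp(buf, "NPTL", 4) == 0 ? 1 : 0;
}
```

This has to be compiled with the same cross toolchain and run on the
target, since the answer comes from the libc installed there.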
>
>>
>> so, no coreutils/util-linux/util-linux-ng?
>
> coreutils is there.....but most commands are stripped down/lightweight
> versions of the originals! :)
>
>>
>>> Any other way to get more thread related info about a running application?
>>
>> everything under /proc/<pid>? have you checked that?
>
> This helped a little!
>
> I can see the threads spawned by the main thread under
> "/proc/<pid>/task". This dir lists pid's of all the threads started by
> the parent proc...and contents of individual dir (pids) is same as
> "/proc/<pid>"...
>
> Here I could find out my decoder thread's ID...but again the contents of
> that dir do not show info like priority/nice value etc...
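
The per-thread "stat" file should carry this: according to proc(5),
fields 18 and 19 of /proc/<pid>/task/<tid>/stat are the priority and
nice value. A small sketch of pulling them out; the function names are
hypothetical, and how the SD scheduler fills the priority field on the
vendor kernel is not something verified here.

```c
#include <stdio.h>
#include <string.h>

/* Parse priority (field 18) and nice (field 19) from one line of a
 * stat file. Returns 0 on success, -1 on failure. */
static int parse_stat(const char *line, long *priority, long *nice_val)
{
    /* Field 2 (comm) may contain spaces or ')', so resume parsing
     * after the LAST ')' in the line. */
    const char *p = strrchr(line, ')');

    if (p == NULL)
        return -1;
    /* Skip state (field 3) and fields 4..17, then read 18 and 19. */
    if (sscanf(p + 1,
               " %*c %*s %*s %*s %*s %*s %*s %*s"
               " %*s %*s %*s %*s %*s %*s %*s %ld %ld",
               priority, nice_val) != 2)
        return -1;
    return 0;
}

/* Read the values for one thread of one process. */
static int read_thread_prio(int pid, int tid, long *priority, long *nice_val)
{
    char path[64], line[1024];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/%d/task/%d/stat", pid, tid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(line, sizeof(line), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return parse_stat(line, priority, nice_val);
}
```

Calling read_thread_prio() with the process ID and the decoder thread's
ID from the task directory before and after changing the Qt thread
priority would show whether the wrapper maps to nice values.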
>
> Thanks again for your inputs. I'll keep posting my findings
> here...till I get a satisfactory soln to this issue.
>
> Regards,
> -mandeep
>
>>
>> --
>> regards,
>>
>> Mulyadi Santosa
>> Freelance Linux trainer and consultant
>>
>> blog: the-hydra.blogspot.com
>> training: mulyaditraining.blogspot.com
>>
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies at kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
>

-- 
Regards,
Sri.