Question about switch_mm function

Wed Mar 25 15:13:55 EDT 2015

On Wed, Mar 25, 2015 at 10:33 AM, Rajat Sharma <fs.rajat at gmail.com> wrote:
>
>
> On Mar 25, 2015 10:31 AM, "Sreejith M M" <sreejith.mm at gmail.com> wrote:
> >
> > On Wed, Mar 25, 2015 at 10:55 PM,  <Valdis.Kletnieks at vt.edu> wrote:
> > > On Wed, 25 Mar 2015 21:35:22 +0530, Sreejith M M said:
> > >
> > >> > This code is handling context switch from a kernel thread back to user mode
> > >> > thread so TLB entries are invalid translation for user mode thread and do
> > >> > not correspond to user process pgd. It is Master kernel page table
> > >> > translation as a result of kernel thread execution.
> > >> >
> > >> > -Rajat
> > >> Hi Rajat,
> > >>
> > >> If that is the case, why this code is put under CONFIG_SMP switch?
> > >
> > > Vastly simplified because I'm lazy :)
> > >
> > > If you look at the code, it's poking the status on *other* CPUs.  That's why
> > > the cpumask() stuff.
> > >
> > > If you're on a single execution unit, you don't have to tell the other
> > > CPU about the change in state, because there isn't an other CPU.
> >
> > can you come out of this lazy mode explain this a bit more because I
> > am a newbie ?or tell me what else I should know before I have to
> > understand this code
> >
> > --
> > Regards,
> > Sreejith
>
> Valdis is talking about lazy tlb flush, not him being lazy. Otherwise he wouldn't have replied at all :)

Okay, a bit more detail. I admit I had to dig a bit more to find
this out. After all, we are all newbies :)

On an SMP system, there is an optimization called lazy TLB mode for
kernel threads. Follow the steps:

1. Assume that some of the CPUs are executing a multithreaded user mode
application, so essentially they all share the same mm and page tables.
2. Now let's say one CPU changes/assigns the physical page frame for a
user mode linear address, let's say as a result of processing a system
call on behalf of the user mode process (putting data in a user mode
buffer, etc.). It needs to invalidate the old TLB entry for this linear
address in its local TLB.
3. Since the application is multithreaded, some other CPU sharing the
same page table will have old values for the corresponding linear
address in its TLB.
4. Normally we would invalidate TLB entries of all CPUs sharing this page table.
5. Now suppose some of the participating CPUs are running a kernel
thread. A kernel thread does not want to be bothered by this change, as
it has nothing to do with user mode TLB entries, so it puts its
executing CPU into a do-not-disturb mode called lazy TLB mode.
6. TLB invalidation on all CPUs executing a kernel thread is deferred
until the kernel thread is finished.
7. At this point, when the kernel thread switches back to a user mode
process, the invalidation is done, and this is the code which you are
referring to.
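
The steps above can be sketched as a small userspace toy model in C.
This is only an illustration of the idea as described here, not the
kernel's actual code: struct toy_cpu, toy_shootdown() and the other
toy_* names are made up for this example, and a "flush" is just a
counter being incremented.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TOY_CPUS 4

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct toy_cpu {
    int  active_mm;      /* id of the mm whose page tables are loaded */
    bool lazy;           /* true while a kernel thread borrows active_mm */
    bool flush_pending;  /* an invalidation arrived while in lazy mode */
    int  tlb_flushes;    /* how many times this CPU flushed its TLB */
};

struct toy_cpu toy_cpus[NR_TOY_CPUS];

/* Step 1: every CPU runs a thread of the same process (same mm). */
void toy_init(int mm)
{
    for (int i = 0; i < NR_TOY_CPUS; i++) {
        toy_cpus[i].active_mm = mm;
        toy_cpus[i].lazy = false;
        toy_cpus[i].flush_pending = false;
        toy_cpus[i].tlb_flushes = 0;
    }
}

/* Step 5: a kernel thread is scheduled; the old mm stays loaded. */
void toy_enter_kernel_thread(int cpu)
{
    assert(cpu >= 0 && cpu < NR_TOY_CPUS);
    toy_cpus[cpu].lazy = true;
}

/* Steps 2-4 and 6: a mapping in `mm` changed on CPU `src`. */
void toy_shootdown(int src, int mm)
{
    for (int i = 0; i < NR_TOY_CPUS; i++) {
        if (toy_cpus[i].active_mm != mm)
            continue;                           /* not sharing this mm */
        if (i != src && toy_cpus[i].lazy)
            toy_cpus[i].flush_pending = true;   /* deferred (step 6) */
        else
            toy_cpus[i].tlb_flushes++;          /* flushed right away */
    }
}

/* Step 7: the CPU switches from the kernel thread back to a user thread. */
void toy_switch_to_user(int cpu, int mm)
{
    struct toy_cpu *c = &toy_cpus[cpu];

    if (c->flush_pending || c->active_mm != mm) {
        c->tlb_flushes++;                       /* the deferred flush */
        c->flush_pending = false;
    }
    c->active_mm = mm;
    c->lazy = false;
}
```

For example, after toy_init(1), toy_enter_kernel_thread(2) and
toy_shootdown(0, 1), CPUs 0, 1 and 3 have flushed once, while CPU 2 has
only recorded a pending flush. It pays for that flush later, in
toy_switch_to_user(2, 1).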

Just in case you wonder where the invalidation happens: invalidation
is an arch-specific step. In the simplest form it flushes all TLB
entries and lets them build up again over time. That's why it is
costly, and why an optimization like lazy TLB mode pays off. On x86 it
is done by reloading cr3.

http://stackoverflow.com/questions/1090218/what-does-this-little-bit-of-x86-doing-with-cr3
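
To see why a full flush is costly, here is a rough userspace sketch in
C of a tiny direct-mapped TLB. Everything in it (struct toy_tlb,
toy_tlb_lookup(), toy_load_cr3()) is a hypothetical name made up for
this example; it ignores global pages and other hardware details and
only models "loading cr3 drops every cached translation".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_TLB_SLOTS 8

/* Hypothetical toy TLB; real hardware is far more elaborate. */
struct toy_tlb {
    bool          valid[TOY_TLB_SLOTS];
    unsigned long vpn[TOY_TLB_SLOTS];   /* cached virtual page numbers */
    unsigned long cr3;                  /* modelled page table base */
    unsigned int  misses;               /* lookups that needed a page walk */
};

/*
 * Look up a virtual page number. Returns true on a hit. On a miss the
 * (expensive) page walk is counted and the slot is refilled.
 */
bool toy_tlb_lookup(struct toy_tlb *t, unsigned long vpn)
{
    unsigned int slot = (unsigned int)(vpn % TOY_TLB_SLOTS);

    if (t->valid[slot] && t->vpn[slot] == vpn)
        return true;

    t->misses++;
    t->valid[slot] = true;
    t->vpn[slot] = vpn;
    return false;
}

/* Model of writing cr3: all cached translations are thrown away. */
void toy_load_cr3(struct toy_tlb *t, unsigned long pgd)
{
    t->cr3 = pgd;
    memset(t->valid, 0, sizeof(t->valid));
}
```

In this model, touching pages 0..3 twice costs 4 misses in total, but
calling toy_load_cr3() in between makes every page miss again. That
rebuilding cost is what lazy TLB mode tries to avoid paying on CPUs
that are only running a kernel thread.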

-Rajat