What is the rationale behind the physical and virtual kernel memory layout?

Fri Oct 19 05:33:41 EDT 2012

Hi all,

This is a vast topic. But I believe it is worthwhile to explain in greater
detail why this design was chosen before explaining how it was implemented.
I think this is commonly missing (or at least underdeveloped) in
documentation, even in the most outstanding examples.

Let's try to be more precise:

A] About the direct mapping of the first 896MB of the virtual kernel space
to the first 896MB of the physical memory

Thus the kernel is able to handle a big virtual memory area mapped to a
zone of contiguous physical addresses. This is needed by some peripheral
devices which are unable to deal with paged memory. So all right, I can
understand that (ref: "Avoiding - and fixing - memory fragmentation"
<http://lwn.net/Articles/211505/> at lwn.net). But this article also says
this is required for large kernel data structures...

Q: ...Here I don't get the point: why wouldn't it be possible for the
kernel to handle its structures through paged memory that is not
necessarily physically contiguous?
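
To make the direct mapping from section A concrete, here is a minimal
user-space sketch of the address arithmetic it implies. It assumes 32-bit
x86 with the default 3G/1G split (kernel space starting at 0xC0000000) and
the 896MB figure quoted above. The sketch_* helpers are hypothetical names
for illustration. They are modeled on the idea behind the kernel's
__pa()/__va() macros but are not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants: 32-bit x86, default 3G/1G split. */
#define PAGE_OFFSET  0xC0000000u            /* start of kernel virtual space */
#define LOWMEM_SIZE  (896u * 1024u * 1024u) /* directly mapped physical range */

/* Hypothetical helper: kernel virtual address of a low-memory physical
 * address. The translation is a constant offset, no page-table walk. */
static uint32_t sketch_phys_to_virt(uint32_t phys)
{
    assert(phys < LOWMEM_SIZE);  /* only the first 896MB is directly mapped */
    return phys + PAGE_OFFSET;
}

/* Hypothetical helper: the inverse translation. */
static uint32_t sketch_virt_to_phys(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET && virt - PAGE_OFFSET < LOWMEM_SIZE);
    return virt - PAGE_OFFSET;
}
```

Under these assumptions, physical address 0x00100000 corresponds to kernel
virtual address 0xC0100000, and two addresses that are adjacent in this
virtual range are also adjacent in physical memory.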

In the bible "Understanding the Linux Kernel" third edition, at section
"8.3 Non-contiguous Memory Area Management" it is written: "[...] it is
preferable to map areas into sets of contiguous page frames, thus making
better use of the cache and achieving lower average memory access times".

Q: Is this assertion still valid on modern architectures? Can someone
please explain the theory behind it in further detail, or point me to
relevant documentation?

Q: Do you see any other reason for this physically-contiguous memory
requirement?

B] About the 3G/1G split of a process's virtual address space on 32-bit
architectures

The article "Virtual Memory I: the problem" <http://lwn.net/Articles/75174/>
at lwn.net says it has to do with the TLB: by sharing the process's
virtual space with the kernel, we also share the TLB and thus avoid a
costly TLB flush at each user-space to kernel-space switch. The article
gives this as the reason for the performance degradation when talking
about the 4G/4G split patch (from Ingo Molnar).
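
As a rough illustration of what "sharing the virtual space" means in the
3G/1G case, the sketch below assumes 32-bit x86 without PAE, where the
page directory has 1024 entries and each entry covers 4MB. With the split
at 0xC0000000, the top quarter of the entries covers kernel space in every
process. The sketch_* names are hypothetical, and the constants are
assumptions of this example, not something taken from the article.

```c
#include <stdint.h>

/* Assumed layout: 32-bit x86, non-PAE, 3G/1G split. */
#define PGDIR_SHIFT   22           /* each page-directory entry covers 4MB */
#define PTRS_PER_PGD  1024u        /* entries in one page directory */
#define PAGE_OFFSET   0xC0000000u  /* first kernel virtual address */

/* Hypothetical helper: page-directory slot that maps a virtual address. */
static unsigned int sketch_pgd_index(uint32_t vaddr)
{
    return vaddr >> PGDIR_SHIFT;
}

/* Hypothetical helper: does the address belong to the kernel's 1GB? */
static int sketch_is_kernel_address(uint32_t vaddr)
{
    return vaddr >= PAGE_OFFSET;
}

/* Number of page-directory entries that cover kernel space. */
static unsigned int sketch_kernel_pgd_entries(void)
{
    return PTRS_PER_PGD - sketch_pgd_index(PAGE_OFFSET);
}
```

Under these assumptions, slots 0 to 767 map user space and slots 768 to
1023 map the kernel, so entering the kernel does not require loading a
different page directory.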

Q: In the 4G/4G split case, I don't see why we necessarily have to flush
the TLB when switching from user space to kernel space. Why couldn't the
TLB be shared across one user-space to kernel-space switch, and be
flushed only every two switches?

The TLB is small, OK... but is it really a matter of TLB size? I mean the
kernel can rely on big pages for backing its virtual space: this is
already the case on the Intel Pentium (ref: "Understanding the Linux
Kernel" third edition), which uses 4MB pages, or on the Freescale
e500v2/e500mc family, which uses 256MB pages. On the latter architecture,
if the kernel had to map 4GB of physical memory, it could even use a
single huge page, thus occupying a single page table entry, and so a
single TLB entry...
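
To put numbers on the page-size argument, here is a small sketch of the
arithmetic. The helper names are hypothetical, and any TLB entry count
passed to them is an assumption of the caller, since the real figure
depends on the CPU.

```c
#include <assert.h>
#include <stdint.h>

#define KB (1024ULL)
#define MB (1024ULL * KB)
#define GB (1024ULL * MB)

/* Hypothetical helper: amount of memory a TLB can map at once ("reach"). */
static uint64_t sketch_tlb_reach(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}

/* Hypothetical helper: pages (hence mappings) needed to cover a region,
 * rounding up to a whole number of pages. */
static uint64_t sketch_pages_needed(uint64_t region, uint64_t page_size)
{
    assert(page_size != 0);
    return (region + page_size - 1) / page_size;
}
```

For example, covering 896MB takes 229376 pages of 4KB but only 224 pages
of 4MB, and a hypothetical 64-entry TLB reaches 256KB with 4KB pages
versus 256MB with 4MB pages.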

Thanks to anyone who can bring a gentle breeze of cleverness to my cloudy
brain,
Telenn