Page fault in kernel code

Valdis.Kletnieks at vt.edu
Tue Sep 9 11:51:44 EDT 2014


On Tue, 09 Sep 2014 18:53:55 +0530, Manavendra Nath Manav said:

> Why is it so? Why can't kernel mode code handle the page fault and reload
> the page from swap? Also, can page fault occur when kernel is executing in
> process context and/or interrupt context?

There's no inherent chiseled-in-stone rule that says "the operating system's
kernel may not page fault", and in fact many operating systems allow it. The
IBM OS/360 family, from VS/1 and MVS (OS/360's own MFT and MVT variants ran on
hardware that didn't do virtual memory) clear through z/OS 40 years later, has
supported having part of the kernel be pageable.  I've worked
with several Unix variants that allowed parts of the kernel to be pageable.

But that's a design decision that adds little real benefit, especially on
today's large RAM systems - even a Raspberry Pi has enough memory that you
don't really need to worry about making the kernel pageable.

Cautionary tale:  I once had a UTX/32 system that had routines for recovery
from disk errors (in particular, recovering and forwarding of bad blocks to
spare blocks was done by the host, *not* the device), and supported having
about 1/3 of the kernel code be pageable (this was in 1985 or so, and a
Powernode/9080 with 16M of RAM was a *big* system, so being able to put 500K of
a 1.5M kernel out on disk was a big win for performance).  I'll let you think
about what sort of afternoon I had the day that we kept hitting an I/O error on
a bad block in the swap area (which quite reasonably paused all I/O to the
failing disk until the error recovery routine ran), while the block-forwarder
module was swapped out....
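
For anybody who wants to see the circular dependency spelled out, here is a
toy model of it. This is only a sketch and nothing like the real UTX/32 code:
ToyKernel, Deadlock, swap_io_paused and the routine names are all invented for
illustration, and "deadlock" is modeled as an exception rather than a hang.

```python
class Deadlock(Exception):
    """Raised when a routine must run before it can be paged in."""


class ToyKernel:
    """Hypothetical model: pageable routines plus one swap disk."""

    def __init__(self, paged_out):
        # Names of kernel routines whose code currently lives in swap.
        self.paged_out = set(paged_out)
        # True while swap-disk I/O is paused waiting for error recovery.
        self.swap_io_paused = False
        # Routines currently being paged in, used to detect the cycle.
        self._faulting = set()

    def run(self, routine):
        # Running a paged-out routine first requires a page-in from swap.
        if routine in self.paged_out:
            self._page_in(routine)
        return f"{routine} ran"

    def _page_in(self, routine):
        if routine in self._faulting:
            raise Deadlock(f"{routine} is needed to page in {routine}")
        self._faulting.add(routine)
        try:
            self._read_swap()
            self.paged_out.discard(routine)
        finally:
            self._faulting.discard(routine)

    def _read_swap(self):
        if self.swap_io_paused:
            # I/O stays paused until the recovery routine has run.
            self.run("block_forwarder")
            self.swap_io_paused = False
```

If block_forwarder is resident, a fault on some other paged-out routine
recovers normally. Put "block_forwarder" in paged_out, set swap_io_paused, and
call run("block_forwarder"), and the model raises Deadlock: the page-in needs
the disk, and the disk needs the routine that is being paged in.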

(And I've had to debug similar dork-ups in VS/1, VM/SP, and MUSIC as well.
Actually... hmm, yep.  I think I've seen every single OS I've worked with in 3
decades that supported a paged kernel end up shooting itself in the foot because
the wrong thing was paged out at the wrong time. That stuff is *hard* to get
right...)

That sort of thing is why Linus decided to Just Say No. ;)

