Direct IO and Page cache

Chinmay V S cvs268 at gmail.com
Fri Jul 26 05:14:21 EDT 2013


> We have direct I/O (O_DIRECT), for example raw devices (/dev/rawctl) that
> map to block devices, and we also have the page cache. Now if I've
> understood this correctly, direct I/O will bypass the page cache, which
> is fine. I'll not get into the performance debate, but what about data
> consistency? The kernel cannot and __shouldn't__ try to control how
> applications are written. So one bad day somebody comes up with
> an application that does both types of I/O (one that goes
> through the page cache and one that doesn't). In that application,
> one instance writes directly to the backend device while the other
> instance, unaware of this write, goes ahead and writes to the
> page cache, and that write is flushed later to the backend device.
> So wouldn't we end up corrupting the on-disk data?

Yes. And that is the responsibility of the application. While the
existence of O_DIRECT may not be common knowledge, anyone who knows about
it *must* know that it bypasses the kernel page cache and hence *must*
know the consequences of doing cached and direct I/O on the same file
simultaneously.

> I can think of multiple other scenarios that could corrupt the on-disk
> data if the kernel doesn't employ any safeguarding policies.
> But I'm quite sure the kernel is aware of such nasty attempts, and
> I'd like to know how the kernel takes care of this.

O_DIRECT is an explicit flag; it is not enabled by default.

It is the app's responsibility to ensure that it does NOT misuse the
feature. Essentially, specifying the O_DIRECT flag is the app's way of
saying - "Hey kernel, I know what I am doing. Please step aside and
let me talk to the hardware directly. Please do NOT interfere."

The kernel happily obliges.

Later, the app should NOT go crying back to the kernel (and blaming it)
if the app manages to screw up the direct "relationship" with the
hardware.
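
To make the "explicit flag" point concrete, here is a rough sketch of what
opting in to O_DIRECT looks like from user space. The helper name
write_block_direct and the 4096-byte BLOCK_SIZE are my own assumptions for
illustration; real code should query the alignment the device actually needs.
The sketch falls back to ordinary buffered I/O plus fsync() when the
filesystem rejects O_DIRECT, so the app uses exactly one path per write.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes O_DIRECT in <fcntl.h> on glibc */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096        /* assumed alignment, not queried from the device */

/*
 * Hypothetical helper: write one BLOCK_SIZE block holding `text` at offset 0
 * of `path`, bypassing the page cache when the filesystem allows it.
 * Returns 1 if O_DIRECT was used, 0 if it fell back to buffered I/O,
 * and -1 on error.
 */
int write_block_direct(const char *path, const char *text)
{
    void *buf = NULL;
    int used_direct = 1;

    /* O_DIRECT generally needs the buffer, length and offset aligned. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, 0, BLOCK_SIZE);
    strncpy((char *)buf, text, BLOCK_SIZE - 1);

    int fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL) {
        /* This filesystem rejects O_DIRECT: use the page cache instead. */
        used_direct = 0;
        fd = open(path, O_WRONLY | O_CREAT, 0644);
    }
    if (fd < 0) {
        free(buf);
        return -1;
    }

    ssize_t n = pwrite(fd, buf, BLOCK_SIZE, 0);
    if (n < 0 && errno == EINVAL && used_direct) {
        /* Direct write rejected: clear O_DIRECT and retry as buffered. */
        int flags = fcntl(fd, F_GETFL);
        if (flags >= 0 && fcntl(fd, F_SETFL, flags & ~O_DIRECT) == 0) {
            used_direct = 0;
            n = pwrite(fd, buf, BLOCK_SIZE, 0);
        }
    }

    /* fsync() pushes any cached data (fallback path) and metadata to disk. */
    int ok = (n == BLOCK_SIZE) && (fsync(fd) == 0);
    close(fd);
    free(buf);
    return ok ? used_direct : -1;
}
```

Nothing here stops another process from doing buffered writes to the same
file at the same time. Coordinating that (file locks, a single writer, or
simply not mixing the two modes) is left to the application, as described
above.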

regards
ChinmayVS



More information about the Kernelnewbies mailing list