Asynchronous read

Adam Cozzette acozzette at cs.hmc.edu
Sun Jul 31 22:45:08 EDT 2011


On Sun, Jul 31, 2011 at 03:58:55PM -0700, Da Zheng wrote:
> Hello,
> 
> I'm trying to understand the read operation in VFS, and I'm confused by the
> asynchronous and synchronous operations.
> 
> At the beginning, do_sync_read() invokes aio_read, which is
> generic_file_aio_read for ext4. generic_file_aio_read should be an
> asynchronous read. But what really confuses me is do_generic_file_read, which
> is called by generic_file_aio_read. It seems to me do_generic_file_read
> implements a synchronous read, as this is the only function I can find that
> copies data to user space by invoking the actor callback function. If
> do_generic_file_read is synchronous, how can generic_file_aio_read be
> asynchronous?
> 
> In do_generic_file_read, if the data to be read isn't in the cache, normally
> page_cache_sync_readahead should be called. As far as I understand, when
> page_cache_sync_readahead returns, the pages will be ready in the cache, but the
> corresponding data on the disk isn't necessarily copied to the pages yet
> (because it eventually only invokes submit_bio to submit the IO requests to the
> block layer), so PageUptodate of the requested page might still return false,
> and then do_generic_file_read tries to invoke readpage to read the page again
> instead of waiting. Since the disk is always very slow, doesn't it just waste
> CPU time? Or am I missing something?

This is a bit puzzling. I haven't figured it out, but here are some things I
came across while trying to solve the problem.

First of all, this article might shed some light on the problem:

http://lwn.net/Articles/170954/

Essentially, a few years ago there was a simplification of the API and aio_read
and aio_write gained the ability to do vectored operations, making it possible
to eliminate readv and writev. This even made it possible for drivers and
filesystems to avoid implementing read() and write(), since the aio versions
could take care of that.

So I suspect that aio_read and aio_write are now often used in cases where
they're not actually expected to be asynchronous, simply because reusing those
functions for synchronous operations keeps the API smaller. In fact the LWN
article says:

    Note that this change does not imply that asynchronous operations themselves
    must be supported - it is entirely permissible (if suboptimal) for
    aio_read() and aio_write() to operate synchronously at all times.

So perhaps generic_file_aio_read is not actually asynchronous? My only other
guess is that whatever it does happens fast enough to count as asynchronous.

-- 
Adam Cozzette
Harvey Mudd College Class of 2012


