BFQ: simple elevator

Valdis.Kletnieks at vt.edu
Wed Mar 20 19:10:15 EDT 2013


On Wed, 20 Mar 2013 14:41:31 -0700, Raymond Jennings said:

> Suppose you have requests at sectors 1, 4, 5, and 6
>
> You dispatch sectors 1, 4, and 5, leaving the head parked at 5 and the
> direction as ascending.
>
> But suddenly, just before you get a chance to dispatch for sector 6,
> sector 4 gets busy again.
>
> I'm not proposing going back to sector 4.  It's behind us and (as you
> indicated) we could starve sector 6 indefinitely.
>
> So instead, because sector 4 is on the wrong side of our present head
> position, it is ignored and we keep marching forward, and then we hit
> sector 6 and dispatch it.
>
> Once we hit sector 6 and dispatch it, we do a u-turn and start
> descending.  That's when we pick up sector 4 again.
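
Taking that description literally, a rough Python sketch of the scan would
look something like this - the pending list and the dispatch loop are just
illustration, not the actual block layer data structures:

def scan_dispatch(pending, head, ascending=True):
    # Dispatch in one direction; u-turn only when nothing lies ahead.
    order = []
    while pending:
        ahead = [s for s in pending if (s >= head if ascending else s <= head)]
        if not ahead:
            ascending = not ascending      # end of the sweep, turn around
            continue
        nxt = min(ahead) if ascending else max(ahead)
        pending.remove(nxt)
        order.append(nxt)
        head = nxt
    return order

# Head parked at 5, direction ascending; 6 is still pending and a new
# request for 4 just arrived.  4 is behind us, so it waits for the u-turn:
print(scan_dispatch([6, 4], head=5))       # -> [6, 4]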

The problem is that not all seeks are created equal.

Consider requests at sectors 1, 4, 5, and 199343245.  If, as we're servicing
5, another request for 4 comes in, we may well be *much* better off doing a
short seek to 4 and then one long seek to the boonies, rather than 2 long
seeks.

My laptop has a 160G Western Digital drive in it (WD1600BJKT).  The minimum
track-to-track seek time is 2ms, the average is 12ms, and the maximum is
probably on the order of 36ms.  So by replacing 2 max-length seeks with a
track-to-track seek and 1 max-length seek, you can almost halve the delay
waiting for seeks (38ms versus 72ms).  (And it's even better if the target
block is logically before the current one but still on the same track, so you
only take a rotational latency hit and no seek hit at all.)
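
Spelled out with those numbers - this is just back-of-the-envelope
arithmetic, not a real seek model:

# Rough per-seek costs from the WD1600BJKT discussion above, in milliseconds.
TRACK_TO_TRACK = 2
MAX_SEEK       = 36

# Head at 5; requests pending at 4 (right next door) and 199343245 (the boonies).
# Strict one-direction scan: long seek out, then a long seek back for 4.
scan_cost   = MAX_SEEK + MAX_SEEK            # 72 ms
# Grab the short backwards seek first, then do the one long seek.
greedy_cost = TRACK_TO_TRACK + MAX_SEEK      # 38 ms

print(scan_cost, greedy_cost)                # 72 38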

(The maximum is not given in the spec sheets, but is almost always 3 times the
average.)  For a discussion of the math behind that, and a lot of other
issues, see:

http://pages.cs.wisc.edu/~remzi/OSFEP/file-disks.pdf
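
The 3x rule of thumb falls out of the fact that the distance between two
uniformly random positions on the platter averages about a third of a full
stroke; here's a quick sanity check of that number (assuming, crudely, that
seek time tracks seek distance - the paper above refines this):

import random

random.seed(1)
trials = 200_000
# Average |x - y| for two uniformly random track positions in [0, 1):
avg = sum(abs(random.random() - random.random()) for _ in range(trials)) / trials
print(avg)    # ~0.333, i.e. roughly 1/3 of a full-stroke seek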

And of course, this interacts in very mysterious ways with the firmware
on the drive, which can do its own re-ordering of I/O requests and/or
manage the use of the disk's onboard read/write cache.  This is why
command queueing helps throughput: if the disk has the option of
re-ordering 32 requests, it can do more than if it only has 1 or 2
requests in the queue.  Of course, very deep command queues have their
own issues - most notably that at some point you need to use barriers or
something similar to ensure that metadata writes aren't re-ordered into
a pattern that could cause corruption if the disk lost its mind before
completing all the writes...
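
As a toy illustration of the queue-depth effect - the nearest-request
policy here is just a stand-in for whatever the firmware actually does:

import random

def head_travel(requests, depth):
    # Total head movement when the drive can see `depth` outstanding
    # requests at a time and always services the nearest one.
    pending = list(requests)
    head, moved = 0, 0
    while pending:
        window = pending[:depth]               # only this many are visible
        nxt = min(window, key=lambda s: abs(s - head))
        pending.remove(nxt)
        moved += abs(nxt - head)
        head = nxt
    return moved

random.seed(2)
reqs = [random.randrange(1_000_000) for _ in range(256)]
for depth in (1, 2, 32):
    print(depth, head_travel(reqs, depth))
# A deeper queue gives the firmware more reordering freedom and much less
# total head movement.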

> In my case I'm just concerned with raw total system throughput.

See the above discussion.