Hi Greg,

Thanks a lot. Everything you said made complete sense to me, but when I run with the following options my read is very slow (at roughly 1 MB/s it would take around 17 minutes just to read the 1 GB of test data), yet my write is doing fine. Should I use some other dd options? I understand that with direct we bypass all the caches, but direct doesn't guarantee that everything is written when the call returns to the user, which is why I am also using fdatasync.

time dd if=/dev/shm/image of=/dev/sbd0 bs=4096 count=262144 oflag=direct conv=fdatasync
262144+0 records in
262144+0 records out
1073741824 bytes (1.1 GB) copied, 17.7809 s, 60.4 MB/s

real 0m17.785s
user 0m0.152s
sys 0m1.564s

I interrupted the dd for the read because it was taking too long at 1 MB/s:

time dd if=/dev/pdev0 of=/dev/null bs=4096 count=262144 iflag=direct conv=fdatasync
^C150046+0 records in
150045+0 records out
614584320 bytes (615 MB) copied, 600.197 s, 1.0 MB/s

real 10m0.201s
user 0m2.576s
sys 0m0.000s

Thanks,
Neha
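
P.S. One thing I plan to try next (not run yet, so just a sketch): repeating the direct read with a much larger block size, to see whether the roughly 4 ms per 4 KB request is per-request latency rather than a bandwidth limit, e.g.:

  time dd if=/dev/pdev0 of=/dev/null bs=1M count=1024 iflag=direct

If that comes back fast, the limit is presumably the number of round trips through the passthrough device rather than the data rate.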
On Thu, Apr 11, 2013 at 1:49 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
On Thu, Apr 11, 2013 at 2:50 PM, neha naik <nehanaik27@gmail.com> wrote:
> Yes. Interestingly my direct write i/o performance is better than my direct
> read i/o performance for my passthrough device... And that doesn't make any
> kind of sense to me.
>
> pdev0 = pass through device on top of lvm
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/pdev0 of=/dev/null bs=4096
> count=1024 iflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 4.09488 s, 1.0 MB/s
>
> real 0m4.100s
> user 0m0.028s
> sys 0m0.000s
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/shm/image of=/dev/pdev0
> bs=4096 count=1024 oflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 0.0852398 s, 49.2 MB/s
>
> real 0m0.090s
> user 0m0.004s
> sys 0m0.012s
>
> Thanks,
> Neha

I assume your issue is caching somewhere.

If the caching is in the top levels of the kernel, dd has various fsync, fdatasync, etc. options that should address that. I note you aren't using any of them.
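
To make that concrete, roughly what I have in mind (device paths are just placeholders for your setup):

  # flush the data (and wait for it) before dd reports the elapsed time
  dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=262144 conv=fdatasync

  # or bypass the page cache entirely for each request
  dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=262144 oflag=direct

The first keeps the page cache in play but forces the data out before the timing ends; the second skips the page cache per request but says nothing about caches further down the stack.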

You mention LVM. It should pass cache flush commands down, but the last I knew some flavors of mdraid would not; RAID 6 used to discard cache flush commands, IIRC. I don't know whether that was ever fixed.
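
If you want to see whether flushes actually make it to the underlying device, one option (just a sketch, sdX being whatever sits under the LVM volume) is to watch the device with blktrace while the test runs:

  blktrace -d /dev/sdX -o - | blkparse -i -

and look for flush requests showing up in the trace during the run.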

If the cache is in hardware, then dd's cache-flushing calls may or may not get propagated all the way to the device. Some battery-backed caches intentionally reply ACK to a cache flush command without actually doing it.
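
For a plain SATA drive (not a RAID controller volume) you can at least query and toggle the drive's own volatile write cache, e.g. (sdX is a placeholder):

  hdparm -W /dev/sdX      # show whether write caching is enabled
  hdparm -W 0 /dev/sdX    # turn it off for a test run

That doesn't help with a controller's battery-backed cache, but it removes one layer from the picture.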

Further, you're only writing 4 MB, which is not much of a test for most devices; a SATA drive will typically have at least 32 MB of cache. One way to ensure the results are not being skewed by the various caches up and down the storage stack is to write so much data that you overwhelm the caches. That can be a huge amount of data on some systems: a server with 128 GB of RAM may use tens of GB for cache.
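
As a rough example of what I mean (size and path are just an example, adjust to your hardware):

  dd if=/dev/zero of=/dev/pdev0 bs=1M count=8192 oflag=direct conv=fdatasync

writes 8 GB sequentially, which should be well past the cache on a single SATA drive, though possibly not past a large battery-backed controller cache.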

As you can see, testing write-path performance can take significant effort to ensure caches are not biasing your results.

HTH
Greg