Hi Greg,

Thanks a lot. Everything you said made complete sense to me, but when I run with the following options my read is very slow (at roughly 1 MB/s it would take around 17 minutes just to read the 1 GB of test data), yet my write is doing fine. Should I use some other dd options? I understand that with direct we bypass all the caches, but direct doesn't guarantee that everything is written when the call returns to the user, which is why I am also using fdatasync.

time dd if=/dev/shm/image of=/dev/sbd0 bs=4096 count=262144 oflag=direct conv=fdatasync
262144+0 records in
262144+0 records out
1073741824 bytes (1.1 GB) copied, 17.7809 s, 60.4 MB/s

real 0m17.785s
user 0m0.152s
sys 0m1.564s

I interrupted the dd for the read because it was taking too long at 1 MB/s:

time dd if=/dev/pdev0 of=/dev/null bs=4096 count=262144 iflag=direct conv=fdatasync
^C150046+0 records in
150045+0 records out
614584320 bytes (615 MB) copied, 600.197 s, 1.0 MB/s

real 10m0.201s
user 0m2.576s
sys 0m0.000s

Thanks,
Neha
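
P.S. One thing I plan to try next (not run yet, so just a sketch): repeating the direct read with a much larger block size, to see whether the roughly 4 ms per 4 KB request is per-request latency rather than a bandwidth limit, e.g.:

  time dd if=/dev/pdev0 of=/dev/null bs=1M count=1024 iflag=direct

If that comes back fast, the limit is presumably the number of round trips through the passthrough device rather than the data rate.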
On Thu, Apr 11, 2013 at 1:49 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
On Thu, Apr 11, 2013 at 2:50 PM, neha naik <nehanaik27@gmail.com> wrote:
> Yes. Interestingly my direct write i/o performance is better than my direct
> read i/o performance for my passthrough device... And that doesn't make any
> kind of sense to me.
>
> pdev0 = pass through device on top of lvm
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/pdev0 of=/dev/null bs=4096
> count=1024 iflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 4.09488 s, 1.0 MB/s
>
> real 0m4.100s
> user 0m0.028s
> sys 0m0.000s
>
> root@voffice-base:/home/neha/sbd# time dd if=/dev/shm/image of=/dev/pdev0
> bs=4096 count=1024 oflag=direct
> 1024+0 records in
> 1024+0 records out
> 4194304 bytes (4.2 MB) copied, 0.0852398 s, 49.2 MB/s
>
> real 0m0.090s
> user 0m0.004s
> sys 0m0.012s
>
> Thanks,
> Neha

I assume your issue is caching somewhere.

If the caching is in the top levels of the kernel, dd has various fsync, fdatasync, etc. options that should address that. I note you aren't using any of them.
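
To make that concrete, roughly what I have in mind (device paths are just placeholders for your setup):

  # flush the data (and wait for it) before dd reports the elapsed time
  dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=262144 conv=fdatasync

  # or bypass the page cache entirely for each request
  dd if=/dev/shm/image of=/dev/pdev0 bs=4096 count=262144 oflag=direct

The first keeps the page cache in play but forces the data out before the timing ends; the second skips the page cache per request but says nothing about caches further down the stack.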

You mention LVM. It should pass cache flush commands down, but the last I knew some flavors of mdraid would not; RAID 6 used to discard cache flush commands, IIRC. I don't know whether that was ever fixed.
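
If you want to see whether flushes actually make it to the underlying device, one option (just a sketch, sdX being whatever sits under the LVM volume) is to watch the device with blktrace while the test runs:

  blktrace -d /dev/sdX -o - | blkparse -i -

and look for flush requests showing up in the trace during the run.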

If the cache is in hardware, then dd's cache-flushing calls may or may not get propagated all the way to the device. Some battery-backed caches intentionally reply ACK to a cache flush command without actually doing it.
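
For a plain SATA drive (not a RAID controller volume) you can at least query and toggle the drive's own volatile write cache, e.g. (sdX is a placeholder):

  hdparm -W /dev/sdX      # show whether write caching is enabled
  hdparm -W 0 /dev/sdX    # turn it off for a test run

That doesn't help with a controller's battery-backed cache, but it removes one layer from the picture.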

Further, you're only writing 4 MB, which is not much of a test for most devices; a SATA drive will typically have at least 32 MB of cache. One way to ensure the results are not being skewed by the various caches up and down the storage stack is to write so much data that you overwhelm the caches. That can be a huge amount of data on some systems: a server with 128 GB of RAM may use tens of GB for cache.
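
As a rough example of what I mean (size and path are just an example, adjust to your hardware):

  dd if=/dev/zero of=/dev/pdev0 bs=1M count=8192 oflag=direct conv=fdatasync

writes 8 GB sequentially, which should be well past the cache on a single SATA drive, though possibly not past a large battery-backed controller cache.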

As you can see, testing write-path performance can take significant effort to ensure caches are not biasing your results.

HTH
Greg