<div dir="auto">Valdis, what a valuable answer. It opened my eyes. I didn't take the most important thing into account: caches only help on a cache hit!<div dir="auto"><br></div><div dir="auto">I'll try using a disk in memory (residing on a tmpfs mount) to improve this. Good idea!<br><div dir="auto"><br></div><div dir="auto">Thank you so much for sharing this with me!!!</div><div dir="auto"><br></div><div dir="auto">Regards</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On 05/07/2018 10:21 PM,  <<a href="mailto:valdis.kletnieks@vt.edu" target="_blank" rel="noreferrer">valdis.kletnieks@vt.edu</a>> wrote:<br type="attribution"><blockquote class="m_7229748243774679766quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="m_7229748243774679766quoted-text">On Thu, 05 Jul 2018 19:30:22 -0300, "Daniel." said:<br>

<br>

> Sometimes we have a machine that we work on and that is really really slow<br>

> when doing I/O. I know that the kernel will use memory to avoid doing I/O, and<br>

> that it is somewhat conservative about keeping too much data in<br>

> volatile memory susceptible to being lost on power failure. My question is,<br>

> how to do the opposite and avoid I/O as much as possible, regardless of<br>

> the risks?<br>

<br></div>

You're trying to solve a problem that isn't the one you have....<br>

<br>

The kernel avoids I/O by keeping a copy of the data in memory whenever a read or write is done,<br>

in case another request uses that same data again.<br>

<br>

On most filesystems, a userspace write doesn't go directly to disk - it just<br>

goes into the in-memory cache as a "dirty" page, and gets written out to disk<br>

later by a separate kernel thread.  In other words, unless your system has gone<br>

off the deep end into thrashing, writes to disk generally won't block.<br>

<br>
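One way to watch this write-back behaviour is to look at the Dirty and Writeback<br>
counters in /proc/meminfo while the playbook runs. The following is a minimal sketch,<br>
assuming a Linux system; the helper names parse_meminfo and dirty_summary are made up<br>
for illustration.<br>

```python
from pathlib import Path


def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_summary(text):
    """Return only the counters for pages waiting to be, or being, written out."""
    info = parse_meminfo(text)
    return {key: info.get(key, 0) for key in ("Dirty", "Writeback")}


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        print(dirty_summary(meminfo.read_text()))
```

If Dirty stays modest while the run is slow, the writes are probably not what is blocking.<br>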

Meanwhile, if a userspace read finds the data in the cache, it will just return the<br>

data, and again not block. Usually, the only time disk I/O will block is on a read<br>

of data that is *not* already in the in-memory cache.<br>

<br>

The end result is that the effectiveness of the cache depends on what percent of<br>

the reads are already in there.  And now the bad news...<div class="m_7229748243774679766quoted-text"><br>

<br>

> I'm using a virtual machine to test some ansible playbooks, the machine is<br>

> just a testing environment so it will be created again and again and again.<br>

> (And again). The playbook generates a lot of I/O, from yum installs, and<br>

> other commands that inspect ISO images to create repositories,  ... it<br>

<br></div>

Ansible is famous for generating *lots* of disk reads. For example, 'lineinfile' will<br>

usually end up reading most/all of the file, especially if the expected line isn't in there.<br>

And if you're testing against a blank system, the line probably isn't in there, so you have<br>

to read the whole file...   And how many times do you have more than one 'lineinfile'<br>

that hits the same file? That's the only time the in-memory cache will help: the<br>

second time the file is referenced.   And I'll bet you that reading ISO images to create<br>

repositories generates a lot of non-cacheable data - almost guaranteed unless you read<br>

the same ISO image more than once.  Similarly for yum installs - each RPM will only be<br>

read once, clogging the in-memory cache.<div class="m_7229748243774679766quoted-text"><br>

<br>

> Anyway. The idea is that the flushing thread enters as soon as possible and<br>

> that the blocking happens as late as possible so that I leave disks working but<br>

> avoid I/O blocking.<br>

<br></div>

Unfortunately, that's not how it works.  If you want to avoid blocking, you want to<br>

maximize the cache hits (which is unfortunately *very* difficult on a system install<br>

or ansible run).<br>

<br>
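To make the cache-hit argument concrete, here is a toy model that replays a list of<br>
block reads through a small LRU cache and reports the hit ratio. It is only a sketch:<br>
the real page cache is not a plain LRU, and lru_hit_ratio and the block numbers are<br>
invented for illustration.<br>

```python
from collections import OrderedDict


def lru_hit_ratio(accesses, capacity):
    """Replay block accesses through an LRU cache and return the fraction of hits."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


# Each block read exactly once, like RPMs or ISO contents during an install.
read_once = list(range(1000))
# The same 100 blocks read ten times over.
reread = list(range(100)) * 10

print(lru_hit_ratio(read_once, capacity=500))  # 0.0
print(lru_hit_ratio(reread, capacity=500))     # 0.9
```

The read-once workload never hits no matter how large the cache is, which is why<br>
tuning the cache does not help here.<br>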

You might be able to generate some win by using a pre-populated tmpfs<br>

and/or an SSD for the VM's disk.<br>

<br>
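Pre-populating a tmpfs can be as simple as copying the repository tree into a<br>
RAM-backed directory before the playbook starts. A minimal sketch, assuming Python 3.8<br>
or later; the function name prepopulate and the paths in the usage comment are<br>
hypothetical.<br>

```python
import shutil
from pathlib import Path


def prepopulate(source_dir, ram_dir):
    """Copy source_dir into ram_dir and return the total size in bytes of the files now there."""
    source = Path(source_dir)
    target = Path(ram_dir) / source.name
    # dirs_exist_ok lets the copy be repeated between test runs (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return sum(p.stat().st_size for p in target.rglob("*") if p.is_file())


# Hypothetical usage: /srv/repo holds the RPMs and ISOs, and /mnt/ramdisk is
# assumed to be a tmpfs mount that is large enough to hold them.
#   staged = prepopulate("/srv/repo", "/mnt/ramdisk")
#   print(staged, "bytes staged in RAM")
```

Everything copied this way counts against RAM, so the tmpfs has to be sized with<br>
the VM's memory in mind.<br>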

And you may want to look at more drastic solutions - for instance, using an NFS<br>

mount from the hypervisor machine as the source for your ISOs and repositories.<br>

Under some conditions, that can be faster than the VM doing I/O that then has<br>

to be handled by the hypervisor, adding to the overhead.  (This sort of thing is an<br>

*old* trick, dating back to Sun systems in the 3/50 class that had 4M of<br>

memory.  It was faster to put the swap space on a SCSI Fujitsu Eagle disk attached<br>

to a 3/280 server and accessed over NFS over Ethernet than using the much<br>

slower 100M "shoebox" IDE drive that could be directly attached to a 3/50.)<br>

<br>

<br>

</blockquote></div><br></div>