<div dir="auto">Valdis, what a valuable answer. It opened my eyes. I didn't take the most important thing in account, caches only help in cache hit!<div dir="auto"><br></div><div dir="auto">I'll try using a disk on memory (residing on a tmpfs mount) for improving this. Good idea!<br><div dir="auto"><br></div><div dir="auto">Thank you so much for sharing this with me!!!</div><div dir="auto"><br></div><div dir="auto">Regards</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">Em 05/07/2018 10:21 PM, <<a href="mailto:valdis.kletnieks@vt.edu" target="_blank" rel="noreferrer">valdis.kletnieks@vt.edu</a>> escreveu:<br type="attribution"><blockquote class="m_7229748243774679766quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="m_7229748243774679766quoted-text">On Thu, 05 Jul 2018 19:30:22 -0300, "Daniel." said:<br>
<br>
> Sometime we have a machine that we work on and that is really really slow<br>
> when doing I/O. I know that kernel will use memory to avoid doing I/O, and<br>
> that it would be a kind of conservative in avoiding keep to much data on<br>
> volatile memory susceptible to being lost on power failure. My question is,<br>
> how to do the opposite, and avoid I/O as much as possible, doesn't matter<br>
> the risks?<br>

You're trying to solve a problem that isn't the one you have....

The way the kernel avoids I/O is that when a read or write is done, it keeps a
copy of the data in memory in case another request uses that same data again.

On most filesystems, a userspace write doesn't go directly to disk - it just
goes into the in-memory cache as a "dirty" page, and gets written out to disk
later by a separate kernel thread. In other words, unless your system has gone
off the deep end into thrashing, writes to disk generally won't block.
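
If you ever want to tune when that writeback happens - and how much dirty data
can pile up before writers do get throttled - the knobs are the vm.dirty_*
sysctls. A rough sketch, with the numbers purely illustrative:

    # flush early, block late: start background writeback sooner and let more
    # dirty data accumulate before writers are forced to block
    sysctl -w vm.dirty_background_ratio=5        # background flusher kicks in at 5% of RAM dirty
    sysctl -w vm.dirty_ratio=60                  # writers only block once 60% of RAM is dirty
    sysctl -w vm.dirty_writeback_centisecs=100   # wake the flusher threads every second

Note that this only affects the write side - which, as above, usually isn't
what's blocking you anyway.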

Meanwhile, if a userspace read finds the data in the cache, it will just return
the data, and again not block. Usually, the only time a disk I/O will block is
if it does a read that is *not* in the in-memory cache already.
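
You can see the difference yourself by timing a cold read against a warm one -
a rough sketch, run as root, with the path just a placeholder:

    sync                                      # write out dirty pages (drop_caches only drops clean ones)
    echo 3 > /proc/sys/vm/drop_caches         # throw away the page cache so the next read is cold
    time cat /path/to/some/file > /dev/null   # cold read: has to go to the disk
    time cat /path/to/some/file > /dev/null   # warm read: served from the page cache, no blocking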

The end result is that the effectiveness of the cache depends on what percentage
of the reads are already in there. And now the bad news...

> I'm using a virtual machine to test some ansible playbooks; the machine is
> just a testing environment, so it will be created again and again and again.
> (And again.) The playbook generates a lot of I/O, from yum installs and
> other commands that inspect ISO images to create repositories, ... it

Ansible is famous for generating *lots* of disk reads (for example, 'lineinfile'
will usually end up reading most/all of the file, especially if the expected line
isn't in there). And if you're testing against a blank system, the line probably
isn't in there, so you have to read the whole file... And how many times do you
have more than one 'lineinfile' that hits the same file? Because that's the only
time the in-memory cache will help: the second time the file is referenced. And
I'll bet you that reading ISO images to create repositories generates a lot of
non-cacheable data - almost guaranteed unless you read the same ISO image more
than once. Similarly for yum installs - each RPM will only be read once, clogging
the in-memory cache.
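
If you're curious whether a particular file actually ended up (or stayed) in the
page cache, recent util-linux ships a fincore tool, and free shows the overall
picture - the path here is only a placeholder:

    fincore /path/to/some.iso    # how many of the file's pages are currently resident in memory
    free -m                      # the buff/cache column is the total page-cache footprint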

> Anyway. The idea is that the flushing thread kicks in as soon as possible and
> that blocking happens as late as possible, so that I keep the disks working
> but avoid blocking on I/O.

Unfortunately, that's not how it works. If you want to avoid blocking, you want
to maximize the cache hits (which is *very* difficult on a system install or an
ansible run).
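
One cheap trick that sometimes helps is to pre-warm the cache by reading the hot
files once before the run - a sketch, with the paths purely as placeholders:

    cat /path/to/isos/*.iso > /dev/null                      # pull the ISOs into the page cache up front
    find /path/to/repo -type f -exec cat {} + > /dev/null    # same for a directory tree of small files

Of course, that only helps if the VM has enough free RAM to keep them there.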

You might be able to get some win by using a pre-populated tmpfs and/or an SSD
for the VM's disk.
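
For the tmpfs idea, something along these lines (the size and paths are just
placeholders):

    mount -t tmpfs -o size=4g tmpfs /mnt/fast   # RAM-backed scratch area
    cp /path/to/isos/*.iso /mnt/fast/           # pre-populate it once with the ISOs/repo data
    # then point the playbook (or the VM's disk image) at /mnt/fast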

And you may want to look at more drastic solutions - for instance, using an NFS
mount from the hypervisor machine as the source for your ISOs and repositories.
Under some conditions, that can be faster than the VM doing I/O that then has
to be handled by the hypervisor, adding to the overhead. (This sort of thing is
an *old* trick, dating back to Sun systems in the 3/50 class that had 4M of
memory. It was faster to put the swap space on a SCSI Fujitsu Eagle disk attached
to a 3/280 server and accessed over NFS over Ethernet than to use the much
slower 100M "shoebox" IDE drive that could be directly attached to a 3/50.)