Work (really slow directory access on ext4)

Arlie Stephens arlie at worldash.org
Fri Jul 25 21:08:35 EDT 2014


On Jul 25 2014, Valdis.Kletnieks at vt.edu wrote:
> On Fri, 25 Jul 2014 15:23:42 -0700, Arlie Stephens said:
> 
> > If you want an annoying problem, explain and/or fix directory
> > performance on ext4. I've got a server where an ls of a directory took
> > 5 seconds, according to "time", even though it only has 295 entries at
> > present.
> 
> I don't suppose you could get a trace of where that ls is spending its
> time with the kernel's trace facilities, or even just getting a stack trace
> of where that ls is in the kernel?

These are all very good questions. 

To my amazement, I found that no one had yet fixed the problem by
deleting and recreating the directory, and I do have sudo access. 
This time it was only 4 seconds...
     real 0m3.992s
     user 0m0.005s
     sys  0m0.052s
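
Those numbers already say something: user plus sys is only about 0.057s of the 3.992s wall-clock time, so the ls spent nearly all of it blocked rather than computing. That it was blocked on disk I/O is my inference, not something measured here. A rough Python sketch of the same arithmetic follows, along with a hypothetical helper (time_listing, my own name, not part of any tool mentioned in this thread) that takes the same three measurements for one directory read:

```python
import os
import time


def blocked_seconds(real, user, sys_):
    """Wall-clock time not accounted for by CPU time (user + sys)."""
    return real - (user + sys_)


def time_listing(path):
    """Time one os.listdir() call; returns (real, user, sys) in seconds.

    Hypothetical helper for illustration only.
    """
    cpu_before = os.times()
    wall_before = time.perf_counter()
    os.listdir(path)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (
        wall_after - wall_before,
        cpu_after.user - cpu_before.user,
        cpu_after.system - cpu_before.system,
    )


if __name__ == "__main__":
    # Figures from the "time ls" run quoted above.
    print(round(blocked_seconds(3.992, 0.005, 0.052), 3))  # prints 3.935
```

Running time_listing twice in a row on the same directory should show the cold versus cache-hot difference, assuming nothing else has already pulled the directory blocks into the cache.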

> I'll go out on a limb and ask if a *second* ls of the same directory runs
> quickly because it's now cache-hot.  If so, I'd start looking at whether
> there's large amounts of *other* disk activity going on, and the reads of the
> directory are getting hung in the I/O queue behind other disk
> read/writes.

Sure enough, the cache saved me on a second read:
     real 0m0.010s
     user 0m0.000s
     sys  0m0.010s

> Also, are you doing an 'ls' (which just requires reading the name/inode#
> pairs), or an 'ls -l' whihc in addition requires a stat() call to read in the
> inode itself)?  That makes a lot of difference.  Cache-cold on my laptop, and a
> *huge* Mail/linux-kernel directory (yes, it really *is* an 11M directory,
> it's got a half-million entries in it):

I was doing a vanilla ls. So was the original reporter, unless he has
some really strange aliases.
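
To make the ls versus ls -l distinction concrete, here is a minimal sketch in Python. It is an approximation, not what ls actually does internally: real ls also sorts its output, and may stat entries for other reasons (colour output, for instance). The function names are my own. list_names only reads directory entries, while list_with_stat adds one lstat() call per entry, which is the extra cost Valdis describes.

```python
import os
import sys
import time


def list_names(path):
    """Roughly what plain 'ls' needs: just the entry names."""
    with os.scandir(path) as entries:
        return [entry.name for entry in entries]


def list_with_stat(path):
    """Roughly what 'ls -l' needs: one lstat() per entry on top of the names."""
    results = []
    with os.scandir(path) as entries:
        for entry in entries:
            info = os.lstat(entry.path)
            results.append((entry.name, info.st_size, info.st_mtime))
    return results


def timed(func, path):
    """Run func(path) and return (number of entries, elapsed seconds)."""
    start = time.perf_counter()
    result = func(path)
    return len(result), time.perf_counter() - start


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    for func in (list_names, list_with_stat):
        count, seconds = timed(func, target)
        print(f"{func.__name__}: {count} entries in {seconds:.3f}s")
```

Note that the first pass warms the cache for the second, so the two timings are only comparable if each starts cache-cold.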


I'm afraid I'll be rather unpopular if I drop the caches on the system
in question, creating a burst of poor performance, so my best bet is
probably to see what I can do with ftrace on Monday, or perhaps
partway through the weekend.  

There is normally a fair amount of disk activity going on - much of it
writes. So I can expect cached blocks to age out in a reasonable time. 


> [~] echo 3 >| /proc/sys/vm/drop_caches
> [~] cd Mail
> [~/Mail] time ls linux-kernel/ | wc -l
> 478401
> 
> real    0m2.387s
> user    0m0.500s
> sys     0m0.433s
> [~/Mail] ls -ld linux-kernel/
> drwxr-xr-x. 2 valdis valdis 11005952 Jul 25 19:30 linux-kernel/

Compared to your directory, mine is microscopic:

$ ls -ld xxxx
drwxr-xr-x 2 yyy yyy 36864 Jul 25 12:19 xxxx
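
For a sense of scale, a small sketch of the arithmetic on the two ls -ld sizes. The 4096-byte block size is an assumption (I haven't checked either filesystem), and dir_summary is just an illustrative name.

```python
BLOCK_SIZE = 4096  # assumption: 4 KiB blocks; check the real filesystem


def dir_summary(size_bytes, entries, block_size=BLOCK_SIZE):
    """Return (blocks, bytes per entry) for a directory's size and entry count."""
    return size_bytes // block_size, size_bytes / entries


if __name__ == "__main__":
    # Sizes from the two 'ls -ld' outputs; entry counts from this thread.
    for label, size, entries in (
        ("linux-kernel", 11005952, 478401),
        ("xxxx", 36864, 295),
    ):
        blocks, per_entry = dir_summary(size, entries)
        print(f"{label}: {blocks} blocks, {per_entry:.0f} bytes/entry")
    # prints:
    # linux-kernel: 2687 blocks, 23 bytes/entry
    # xxxx: 9 blocks, 125 bytes/entry
```

So the slow directory is only 9 blocks under that assumption. It does use several times more directory space per entry than the big one, which may just reflect longer names or slack space; nothing here says which.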


> [~/Mail] time ls -l linux-kernel/ | wc -l
> 478402
> 
> real    0m32.915s
> user    0m2.483s
> sys     0m20.787s

-- 
Arlie

(Arlie Stephens    arlie at worldash.org)


