Active Memory never reclaimed please help me understand
Prem Kumar
prem.it.kumar at gmail.com
Wed Sep 23 16:46:50 EDT 2015
Dear Mulyadi,
Thank you for your response. Sorry for top posting; I was waiting for my
post to arrive in my mailbox, but it took forever, so I top posted in my
eagerness. You are right that my first application could not possibly have
crashed at 6GB; I should have said about 15GB. I have many nodes, and I just
picked the output from a couple of them to present my observation. Below I
will try to dissect my observation in the hope that you can help me
understand it. My OS concepts have become a little rusty, as I haven't been
in touch with them for a while.
Here is a machine that is currently in this state:
$ free -g
             total       used       free     shared    buffers     cached
Mem:            23         14          8          0          0          0
-/+ buffers/cache:         14          9
Swap:            0          0          0
I have a program that just gobbles up memory. Here is what happens when I
run it:
$ ./eatmemory 8.99G
Eating 8589934592 bytes in chunks of 1024...
Done, press any key to free the memory
$ ./eatmemory 9G
Eating 9663676416 bytes in chunks of 1024...
Killed
I believe there is nothing wrong with the above observation in itself: RAM
is being used by other (presumably running) applications, and I only have so
much available for my program to run.
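A note on the two runs above: the byte counts printed are exactly 8 GiB (8589934592) and 9 GiB (9663676416), so the 8.99G argument appears to have been rounded down to 8G. The following is a minimal Python sketch of argument parsing that would produce those numbers. It is an assumption for illustration only, since the eatmemory source is not shown in this thread, and `parse_size` is a hypothetical helper.

```python
# Hypothetical re-creation of how a size argument such as "9G" might be parsed.
# Assumption: the numeric part is truncated to an integer before the unit
# multiplier is applied. The real eatmemory source is not shown in this thread.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(arg):
    """Convert a string like '9G' into a byte count, truncating fractions."""
    unit = arg[-1].upper()
    if unit in UNITS:
        number, factor = arg[:-1], UNITS[unit]
    else:
        number, factor = arg, 1
    return int(float(number)) * factor


for arg in ("8.99G", "9G"):
    print(arg, "->", parse_size(arg), "bytes")
```

If that is what happens, the successful run allocated 8 GiB rather than 8.99 GiB, so the real threshold lies somewhere between 8 GiB and 9 GiB.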
My issue is that nothing other than system services is running on this
machine, yet the node is unusable for the next program that runs on it and
requests more than about 9G, as above. Below is the output of /proc/meminfo
from the same machine:
$ cat /proc/meminfo
MemTotal: 24724728 kB
MemFree: 9402768 kB
Buffers: 0 kB
Cached: 217464 kB
SwapCached: 0 kB
Active: 14650896 kB
Inactive: 60456 kB
Active(anon): 14647052 kB
Inactive(anon): 40632 kB
Active(file): 3844 kB
Inactive(file): 19824 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 14493928 kB
Mapped: 19544 kB
Shmem: 193720 kB
Slab: 109720 kB
SReclaimable: 12300 kB
SUnreclaim: 97420 kB
KernelStack: 2968 kB
PageTables: 39100 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 12362364 kB
Committed_AS: 15684044 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 493316 kB
VmallocChunk: 34346062668 kB
HardwareCorrupted: 0 kB
AnonHugePages: 13936640 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 7652 kB
DirectMap2M: 25145344 kB
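To relate these numbers to the free output, here is a small Python sketch that parses /proc/meminfo-style text and does the arithmetic. The `SAMPLE` values are copied from the output above; the helper names `parse_meminfo` and `summarize` are made up for this illustration.

```python
# Values copied from the /proc/meminfo output above (all in kB).
SAMPLE = """\
MemTotal: 24724728 kB
MemFree: 9402768 kB
Buffers: 0 kB
Cached: 217464 kB
AnonPages: 14493928 kB
CommitLimit: 12362364 kB
Committed_AS: 15684044 kB
"""

KB_PER_GIB = 1024 * 1024


def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text (values in kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            info[key.strip()] = int(rest.split()[0])
    return info


def summarize(info):
    """Derive a few figures that are easier to compare with `free -g`."""
    free_plus_cache = info["MemFree"] + info["Buffers"] + info["Cached"]
    return {
        "free_plus_cache_gib": free_plus_cache / KB_PER_GIB,
        "anon_gib": info["AnonPages"] / KB_PER_GIB,
        "anon_fraction": info["AnonPages"] / info["MemTotal"],
        "over_commit_limit": info["Committed_AS"] > info["CommitLimit"],
    }


# On a live node, SAMPLE could be replaced by open("/proc/meminfo").read().
for name, value in summarize(parse_meminfo(SAMPLE)).items():
    print(name, value)
```

MemFree + Buffers + Cached comes to about 9.17 GiB, which matches the 9 in the "-/+ buffers/cache" line of free, while AnonPages alone is about 13.8 GiB (roughly 59% of MemTotal). CommitLimit is exactly half of MemTotal, which is consistent with no swap and an overcommit ratio of 50. Committed_AS is above CommitLimit, but that limit only applies under strict overcommit accounting, and the overcommit setting of this node is not shown here.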
Also here is my ulimit which is unlimited:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 192912
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 81920
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
And /proc/self/maps
$ cat /proc/self/maps
00400000-0040b000 r-xp 00000000 00:10 67455633 /bin/cat
0060a000-0060b000 rw-p 0000a000 00:10 67455633 /bin/cat
0060b000-0060c000 rw-p 00000000 00:00 0
0080a000-0080b000 rw-p 0000a000 00:10 67455633 /bin/cat
0209f000-020c0000 rw-p 00000000 00:00 0 [heap]
36d7e00000-36d7e20000 r-xp 00000000 00:10 67454760 /lib64/ld-2.12.so
36d801f000-36d8020000 r--p 0001f000 00:10 67454760 /lib64/ld-2.12.so
36d8020000-36d8021000 rw-p 00020000 00:10 67454760 /lib64/ld-2.12.so
36d8021000-36d8022000 rw-p 00000000 00:00 0
36d8200000-36d838a000 r-xp 00000000 00:10 67456999 /lib64/libc-2.12.so
36d838a000-36d858a000 ---p 0018a000 00:10 67456999 /lib64/libc-2.12.so
36d858a000-36d858e000 r--p 0018a000 00:10 67456999 /lib64/libc-2.12.so
36d858e000-36d858f000 rw-p 0018e000 00:10 67456999 /lib64/libc-2.12.so
36d858f000-36d8594000 rw-p 00000000 00:00 0
7f754caad000-7f754cab0000 rw-p 00000000 00:00 0
7f754cac2000-7f754cac3000 rw-p 00000000 00:00 0
7fff5e496000-7fff5e4ab000 rw-p 00000000 00:00 0 [stack]
7fff5e5f8000-7fff5e5f9000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
On every machine where I ran into this problem, anonymous pages (AnonPages)
are eating up the memory, in effect shrinking the RAM available for programs
to run.
Q) Now my question: since the previous job/program that ran on these
machines has finished or died, my OS concepts tell me that its recently used
cached anonymous pages should be released to meet the request of another
application asking for memory. What am I missing here?
Also, what I fail to understand is the state in which my diskless and
swapless nodes remain: what or who has control over the used-up memory, and
why is it not being granted to the next owner of the machine to run at full
scale? I understand that I will not get all of it, but I expect at least
19GB out of 24GB.
Below is the list of top processes on the machine. Looking at it, I don't
see any heavy use of memory. The mystery makes me feel dumb.
$ ps aux --sort -rss
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 8402 0.0 0.0 119712 15896 ? S 12:11 0:00 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
root 9555 0.0 0.0 3508796 4680 ? S Aug18 0:33 /usr/sbin/slurmd
useralap 8231 0.0 0.0 27224 4604 pts/0 S 12:06 0:00 -bash
root 8401 0.0 0.0 151052 4264 ? S 12:11 0:00 /usr/libexec/sssd/sssd_be --domain default --uid 0 --gid 0 --debug-to-files
root 8153 0.0 0.0 111192 3240 pts/0 Ss 12:05 0:00 -bash
root 2078 0.0 0.0 720600 2968 ? Ssl Aug10 1:39 automount --pid-file /var/run/autofs.pid
root 1752 0.0 0.0 249344 2784 ? Sl Aug10 0:04 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
useralap 9898 1.0 0.0 26196 1468 pts/0 R+ 15:42 0:00 ps aux --sort -rss
root 8150 0.0 0.0 111816 1296 ? Ss 12:05 0:00 sshd: root@pts/0
munge 2146 0.0 0.0 225004 1292 ? Sl Aug10 0:36 /usr/sbin/munged
68 2006 0.0 0.0 41976 1228 ? Ssl Aug10 0:10 hald
root 1671 0.0 0.0 9120 976 ? Ss Aug10 0:00 /sbin/dhclient -1 -q -lf /var/lib/dhclient/dhclient-em1.leases -pf /var/run/dhclient-em1.pid
root 8400 0.0 0.0 114288 900 ? Ss 12:11 0:00 /usr/sbin/sssd -f -D
root 8403 0.0 0.0 105264 876 ? S 12:11 0:00 /usr/libexec/sssd/sssd_pam --uid 0 --gid 0 --debug-to-files
root 2174 0.0 0.0 20000 868 ? Ss Aug10 0:42 crond
root 2111 0.0 0.0 66188 712 ? Ss Aug10 0:00 /usr/sbin/sshd
root 3625 0.0 0.0 334616 648 ? SLsl Aug10 0:00 /usr/sbin/ibacm
root 863 0.0 0.0 10832 592 ? S<s Aug10 0:00 /sbin/udevd -d
root 3391 0.0 0.0 10828 588 ? S< Aug10 0:00 /sbin/udevd -d
root 3523 0.0 0.0 10828 588 ? S< Aug10 0:00 /sbin/udevd -d
rpcuser 1893 0.0 0.0 25428 464 ? Ss Aug10 0:00 rpc.statd
root 1736 0.0 0.0 93176 460 ? S<sl Aug10 0:07 auditd
root 1 0.0 0.0 23500 452 ? Ss Aug10 0:02 /sbin/init
root 1799 0.0 0.0 10912 452 ? Ss Aug10 6:45 irqbalance --pid=/var/run/irqbalance.pid
root 8230 0.0 0.0 165156 448 pts/0 S 12:06 0:00 su - useralap
rpc 1875 0.0 0.0 18976 300 ? Ss Aug10 0:02 rpcbind
dbus 1934 0.0 0.0 23484 280 ? Ss Aug10 0:00 dbus-daemon --system
root 2199 0.0 0.0 21076 212 ? Ss Aug10 0:00 /usr/sbin/atd
root 2207 0.0 0.0 21792 212 ? S Aug10 0:24 /usr/sbin/ipmievd sel pidfile=/var/run/ipmievd.pid
root 2007 0.0 0.0 20400 184 ? S Aug10 0:00 hald-runner
root 2043 0.0 0.0 22520 164 ? S Aug10 0:00 hald-addon-input: Listening on /dev/input/event0
68 2045 0.0 0.0 18008 148 ? S Aug10 0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
root 1997 0.0 0.0 4080 116 ? Ss Aug10 0:00 /usr/sbin/acpid
root 2097 0.0 0.0 6260 116 ? Ss Aug10 0:00 /usr/sbin/mcelog --daemon
root 2222 0.0 0.0 4064 76 tty2 Ss+ Aug10 0:00 /sbin/mingetty /dev/tty2
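To put a number on "no heavy use of memory", the RSS column can be summed and compared with AnonPages. The following is a minimal Python sketch, assuming the ps output is available as text with one process per line. `total_rss_kb` is a hypothetical helper, and `SAMPLE` holds only the first three rows of the listing above.

```python
def total_rss_kb(ps_text):
    """Sum the RSS column (6th field, in kB) of `ps aux` output."""
    total = 0
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)  # the COMMAND field may contain spaces
        if len(fields) >= 6 and fields[5].isdigit():
            total += int(fields[5])
    return total


# First three rows of the listing above, one process per line.
SAMPLE = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 8402 0.0 0.0 119712 15896 ? S 12:11 0:00 /usr/libexec/sssd/sssd_nss --uid 0 --gid 0 --debug-to-files
root 9555 0.0 0.0 3508796 4680 ? S Aug18 0:33 /usr/sbin/slurmd
useralap 8231 0.0 0.0 27224 4604 pts/0 S 12:06 0:00 -bash
"""

print(total_rss_kb(SAMPLE), "kB resident in the sampled processes")
```

RSS counts shared pages once per process, so the sum is an upper bound for the listed processes. The largest single RSS in the listing is 15896 kB, so even the full sum is a tiny fraction of the 14493928 kB reported as AnonPages.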
Please advise and let me know if you need more information.
-best regards!!
On Wed, Sep 23, 2015 at 11:07 AM, Mulyadi Santosa <mulyadi.santosa at gmail.com
> wrote:
>
>
> On Wed, Sep 23, 2015 at 4:47 AM, Prem Kumar <prem.it.kumar at gmail.com>
> wrote:
>
>> also wondering if there is a way I can list Active memory map showing me
>> what is cached?
>>
>> -regards.
>>
>> On Tue, Sep 22, 2015 at 3:08 PM, Prem Kumar <prem.it.kumar at gmail.com>
>> wrote:
>>
>>> Dear All,
>>>
>>> I have done quite a bit of reading on Active memory reported in
>>> /proc/meminfo and in short says it is never reclaimed unless absolutely
>>> necessary, and it caches the recently used files/pages in memory. Although
>>> I fail to understand the consequences that I face here.
>>>
>>> I have disk-less and swap-less nodes. So all I have to do, is play with
>>> the RAM on the box. Issue that brought me here is investigating why after
>>> running some applications, used memory is never available for use with any
>>> other applications.
>>>
>>> In other words I cannot run any programs that requests memory more than
>>> what is shown as free in the output of free command and MemFree in the
>>> output of the cat /proc/meminfo
>>> For example if I ran any program that requires more than 6GB on the
>>> first node below and more than 1GB on the second node below they fail
>>> instantly, and work fine if within the limits of free. There is nothing
>>> else running on the system other than system processes/services.
>>>
>>>              total       used       free     shared    buffers     cached
>>> Mem:            23         17          6          0          0          9
>>> -/+ buffers/cache:          8         15
>>> Swap:            0          0          0
>>>
>>>              total       used       free     shared    buffers     cached
>>> Mem:            23         22          1          0          0          0
>>> -/+ buffers/cache:         21          1
>>> Swap:            0          0          0
>>>
>>> Since the applications that ran previously are not running any more
>>> "even though they died out of memory because they requested more memory
>>> than available", shouldn't the OS see that any memory used previously as
>>> useless and can it not reclaim that for use with the next job/program on
>>> that machine.
>>>
>>> On every machine where I have run into this problem, the output of
>>> /proc/meminfo shows that Active memory accounts for the amount shown as
>>> used by the free command, and this limits my further runs.
>>>
>>> This is driving me insane and making me feel stupid knowing that OS is
>>> smart enough to handle this, then what am I missing here to understand?
>>> Please advise.
>>>
>>> Appreciate any insight into this.
>>>
>>> Best Regards,
>>> Prem
>>>
>>>
>>>
>>
>>
>>
> Dear Prem
>
> welcome to kernelnewbies :) First of all, please don't do top posting when
> replying. Follow like what I and the rest of list member do.
>
> Btw, looking from the free output, I have a doubt about your statement
> that your first application took 6 GB and secondly it took 1 GB. Assuming
> your application doesn't do things like memory locking in kernel space, I guess
> it takes 20+ GB of RAM.
>
> So, before we go further, could you re run your applications and use ps or
> top to see both the VSIZE and RSS they take ?
>
> Regarding memory reclaiming: yes, after an app is killed (by any means
> possible: ctrl-c, sending a kill/term/quit signal, OOM, etc.), any memory
> allocated by that task is freed. This happens for both active and inactive
> pages.
>
>
> --
> regards,
>
> Mulyadi Santosa
> Freelance Linux trainer and consultant
>
> blog: the-hydra.blogspot.com
> training: mulyaditraining.blogspot.com
>