pid start time and new display field in proc pid stat
Navin P
navinp0304 at gmail.com
Thu Mar 25 08:53:03 EDT 2021
Hi,
As of the 5.11 kernel, the pair (pid, pid_start_time) is neither unique nor
monotonic, even though the underlying counters are.
I chose start_boottime because I wanted the counter to keep increasing
across suspend as well.
1. Is there any case where task->start_boottime or
ktime_get_boottime_ns is not monotonic, i.e. not increasing?
2. If start_boottime is not monotonic, which counter should I use?
3. If I create a new field in task_struct, I can use
atomic_add_return(1, &v) to fill in task->new_field. Will this also
work?
By doing this, <pid,pid_start_time> becomes unique.
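To make question 3 concrete, here is a minimal userspace sketch of the counter idea using C11 atomics. It is only an analogy: task_seq and next_task_seq are hypothetical names, and a 64-bit counter is assumed so that wraparound is not a practical concern (a 32-bit counter would eventually wrap and could repeat values).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global sequence counter; a userspace stand-in for the
 * atomic variable "v" mentioned in question 3. */
static atomic_uint_fast64_t task_seq = 0;

/* Return a value that differs on every call, even when several threads
 * call this concurrently. Like atomic_add_return(1, &v), the result is
 * the counter value after the increment. */
static uint64_t next_task_seq(void)
{
    return (uint64_t)atomic_fetch_add(&task_seq, 1) + 1;
}
```

Each new task would store the returned value once at creation time, so that (pid, sequence) identifies a task even after the pid is reused.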
In linux/fs/proc/array.c at line 566 we have
/* apply timens offset for boottime and convert nsec -> ticks */
start_time =
nsec_to_clock_t(timens_add_boottime_ns(task->start_boottime));
task->start_boottime is a monotonically increasing counter fetched from
ktime_get_boottime_ns in fork.c.
nsec_to_clock_t contains a div_u64, so we lose some lower
bits/digits on division,
and the result is not unique unless the divisor is 1.
If USER_HZ = 250 and nsec_to_clock_t is called with x = 4000001, then
#if (NSEC_PER_SEC % USER_HZ) == 0
return div_u64(x, NSEC_PER_SEC / USER_HZ);
becomes div_u64(x, 4000000), and the return value is 1.
When x = 4000002, the return value is still 1,
and it stays 1 until x = 8000000, which returns 2.
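The truncation can be reproduced in userspace with a small sketch. nsec_to_ticks is a hypothetical stand-in for the (NSEC_PER_SEC % USER_HZ) == 0 branch of nsec_to_clock_t, and the tick rate is passed as a parameter so that the divisor of 4000000 from the example (a rate of 250) can be tried directly.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the exact-division branch of
 * nsec_to_clock_t(): integer division discards the remainder, so every
 * nanosecond value inside one tick maps to the same result. */
static uint64_t nsec_to_ticks(uint64_t nsec, uint64_t user_hz)
{
    assert(user_hz > 0 && NSEC_PER_SEC % user_hz == 0);
    return nsec / (NSEC_PER_SEC / user_hz);
}
```

With a rate of 250, nsec_to_ticks(4000001, 250) and nsec_to_ticks(4000002, 250) both give 1, and the result only changes to 2 at 8000000, so two tasks started within the same tick get the same start_time.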
The value shown in /proc/[pid]/stat is therefore the truncated value.
Hence I'm planning to display the untruncated counter at the end of
/proc/[pid]/stat as the 53rd field.
I've prepared a patch, inlined below.
From a2c6b5d6435394f015d38700008ff74f16dfa5fd Mon Sep 17 00:00:00 2001
From: Navin P <navinp0304 at gmail.com>
Date: Thu, 25 Mar 2021 15:27:30 +0530
Subject: [PATCH] Display task->start_boottime as 53rd field in
/proc/[pid]/stat. The 22nd field, currently shown in
/proc/[pid]/stat as start_time, is truncated by division. Hence it is not
unique.
Signed-off-by: Navin P <navinp0304 at gmail.com>
---
Documentation/filesystems/proc.rst | 1 +
fs/proc/array.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 48fbfc336ebf..3b7a1543b2c0 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -381,6 +381,7 @@ It's slow but very precise.
env_end address below which program environment is placed
exit_code the thread's exit_code in the form reported by the waitpid
system call
+ start_boottime the process start time in nanoseconds since boot
============= ===============================================================
The /proc/PID/maps file contains the currently mapped memory regions and
diff --git a/fs/proc/array.c b/fs/proc/array.c
index bb87e4d89cd8..74389aaefa9c 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -645,6 +645,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
else
seq_puts(m, " 0");
+ seq_put_decimal_ull(m, " ", task->start_boottime);
seq_putc(m, '\n');
if (mm)
mmput(mm);
--
2.25.1
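
For illustration, a reader of the extended stat line could pick up the (pid, start_boottime) pair roughly as follows. This is only a sketch: struct task_key and parse_task_key are hypothetical names, and it assumes the new value is the last space-separated field on the line, as in the patch above. On a kernel without the patch the last field is a different value.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical key type: the pid plus the proposed nanosecond start time. */
struct task_key {
    long pid;
    unsigned long long start_boottime_ns;
};

/* Parse the first field (pid) and the last field of one stat line.
 * Taking the last field via strrchr avoids having to parse the comm
 * field, which may itself contain spaces and parentheses.
 * Returns 0 on success and -1 if either field cannot be parsed. */
static int parse_task_key(const char *line, struct task_key *out)
{
    char *end;
    const char *last = strrchr(line, ' ');

    if (last == NULL)
        return -1;

    out->pid = strtol(line, &end, 10);
    if (end == line)
        return -1;

    out->start_boottime_ns = strtoull(last + 1, &end, 10);
    if (end == last + 1)
        return -1;

    return 0;
}
```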