pid start time and new display field in /proc/<pid>/stat [race in task->start_boottime, start_time]

Navin P navinp0304 at gmail.com
Sat Apr 3 04:47:28 EDT 2021


On Sat, Apr 3, 2021 at 8:51 AM Valdis Klētnieks <valdis.kletnieks at vt.edu> wrote:
>
> On Thu, 01 Apr 2021 20:20:04 +0530, Navin P said:
>
> > After printing the task->start_boottime as 53rd field, i found
> > multiple process containing same value due to a race. The race exists
> > for p->start_time as well.
>
> Well, yeah.  There's no existing reason that the kernel shouldn't be able to
> spawn processes on two CPUs in parallel (and initialize the start_time field to
> the same value along the way).  There should be as little synchronizing locking
> as possible, and if you need to do something that requires locking, you should
> probably piggyback it on top of a code section that already has the appropriate
> locks in place.
>
> > What are the other ways  that can be used to fix this problem ?
>
> Is there a particular reason why your code *needs* to have different unique
> values for the start_time and start_boottime fields?  If your code needs
> something unique per-process, the tuple (start_time, PID) *will* be unique
> system-wide.  Or just PID.
>
> In other words, what problem are you trying to solve by making the start_time
> unique?
>
Previously I was under the assumption that (start_time, pid) is unique,
but a pid can be recycled in certain scenarios: when there is a
shortage of free pids, existing pids get reused.

https://bugs.chromium.org/p/project-zero/issues/detail?id=1692

Field 22 in /proc/<pid>/stat is reported in clock ticks, i.e. the
nanosecond start time is divided down by NSEC_PER_SEC/USER_HZ, so we
are losing bits. Even the counter task->start_boottime is not unique,
probably for the reasons you specified, since ktime_get_boottime_ns()
is not race free across CPUs. The idea is to log (ns, pid, start_time)
as a key into sqlite or another db for live monitoring or collection
of metrics.
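
To make the precision loss concrete, here is a minimal userspace
sketch (the USER_HZ value and the sample timestamps are illustrative
assumptions, and ns_to_ticks() only roughly mirrors what the kernel
does for /proc output):

#include <stdio.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define USER_HZ      100ULL	/* tick rate that field 22 is scaled to */

/* Nanoseconds are divided down to 10 ms ticks, discarding low bits. */
static uint64_t ns_to_ticks(uint64_t ns)
{
	return ns / (NSEC_PER_SEC / USER_HZ);
}

int main(void)
{
	/* Two tasks forked 2 ms apart: distinct start_boottime values ... */
	uint64_t a = 123456780000000ULL;
	uint64_t b = a + 2000000ULL;

	/* ... yet both print the same tick count in field 22. */
	printf("a=%llu ticks, b=%llu ticks\n",
	       (unsigned long long)ns_to_ticks(a),
	       (unsigned long long)ns_to_ticks(b));
	return 0;
}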

task->start_boottime is not unique, hence I had two choices: create an
atomic counter in task_struct and use that, or lock accesses to
task->start_boottime. The locking approach needs the tasklist_lock,
which seems a little heavy compared to a new atomic in task_struct.
The atomic in task_struct also has a hidden advantage: it shows the
cumulative number of forks done since the system booted. Comparing the
53rd field of cat /proc/self/stat && sleep 5 && cat /proc/self/stat
lets a monitoring daemon catch forks that ran in between (see the
reader sketch below).
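
A hypothetical monitoring-side reader, assuming the patch below is
applied so the counter shows up as the 53rd field of /proc/self/stat
(read_field53 is an illustrative helper, not existing code):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return field 53 of /proc/self/stat. Fields after the parenthesized
 * comm are space separated; skip past the last ')' first so a comm
 * containing spaces does not throw off the count.
 */
static unsigned long long read_field53(void)
{
	char buf[4096];
	char *p, *tok;
	int field = 2;	/* pid and (comm) are fields 1 and 2 */
	FILE *f = fopen("/proc/self/stat", "r");

	if (!f)
		return 0;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return 0;
	}
	fclose(f);

	p = strrchr(buf, ')');
	if (!p)
		return 0;
	for (tok = strtok(p + 1, " "); tok; tok = strtok(NULL, " "))
		if (++field == 53)
			return strtoull(tok, NULL, 10);
	return 0;
}

int main(void)
{
	/* Each run of this program is itself one fork, so the delta
	 * between two runs counts forks system-wide in between, the
	 * same trick as the two cat invocations above.
	 */
	printf("%llu\n", read_field53());
	return 0;
}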

If the locking approach is taken, then apart from start_boottime,
start_time also needs the lock on tasklist_lock.
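
Roughly, the locking variant inside copy_process() could look like the
sketch below (this is only an illustration, not my posted patch; the
static last_boottime and the forced-monotonicity step are assumptions):

	/* Sketch only: serialize the timestamp assignment under
	 * tasklist_lock and force strict monotonicity so no two tasks
	 * share a start_boottime value.
	 */
	static u64 last_boottime;	/* protected by tasklist_lock */

	write_lock_irq(&tasklist_lock);
	p->start_time = ktime_get_ns();
	p->start_boottime = ktime_get_boottime_ns();
	if (p->start_boottime <= last_boottime)
		p->start_boottime = last_boottime + 1;
	last_boottime = p->start_boottime;
	write_unlock_irq(&tasklist_lock);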

I've posted the locking-based version previously. Below is the patch for the new field.

From a19aadff6b615134379f978cf7afa2c6d99e88cd Mon Sep 17 00:00:00 2001
From: Navin P <navinp0304 at gmail.com>
Date: Sat, 3 Apr 2021 13:31:36 +0530
Subject: [PATCH] Add a new atomic64 field task->tpid_counter in
 task_struct. Add a static atomic64 pid_counter in fork.c which is
 incremented once for every successful call of copy_process.

Signed-off-by: Navin P <navinp0304 at gmail.com>
---
 fs/proc/array.c       | 2 +-
 include/linux/sched.h | 2 +-
 kernel/fork.c         | 7 +++++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index 74389aaefa9c..8dc917676845 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -645,7 +645,7 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 	else
 		seq_puts(m, " 0");
 
-	seq_put_decimal_ull(m, " ", task->start_boottime);
+	seq_put_decimal_ull(m, " ", atomic64_read(&task->tpid_counter));
 	seq_putc(m, '\n');
 	if (mm)
 		mmput(mm);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb22164c..c1f8c19db6e2 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1370,7 +1370,7 @@ struct task_struct {
 #ifdef CONFIG_KRETPROBES
 	struct llist_head		kretprobe_instances;
 #endif
-
+	atomic64_t			tpid_counter;
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.
diff --git a/kernel/fork.c b/kernel/fork.c
index 426cd0c51f9e..d2e865776f12 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -139,6 +139,8 @@ DEFINE_PER_CPU(unsigned long, process_counts) = 0;

 __cacheline_aligned DEFINE_RWLOCK(tasklist_lock);  /* outer */

+static atomic64_t pid_counter;
+
 #ifdef CONFIG_PROVE_RCU
 int lockdep_tasklist_lock_is_held(void)
 {
@@ -2225,6 +2227,11 @@ static __latent_entropy struct task_struct *copy_process(
 	p->start_time = ktime_get_ns();
 	p->start_boottime = ktime_get_boottime_ns();
 
+	/* pid_counter is the static global counter; atomic64_fetch_add()
+	 * increments it and returns the previous value, so each new task
+	 * gets a unique id at fork time. */
+	atomic64_set(&p->tpid_counter, atomic64_fetch_add(1, &pid_counter));
+
 	/*
 	 * Make it visible to the rest of the system, but dont wake it up yet.
 	 * Need tasklist lock for parent etc handling!
-- 
2.25.1


