Safe registration of procfs entries in LKM

Torin Carey torin at
Tue Mar 1 11:56:05 EST 2022

Found the answer in the source, so I thought I'd share in case anyone else is interested:

On Thu, Feb 24, 2022 at 02:12:57PM +0000, Torin Carey wrote:
> The procfs code switched from `struct file_operations`, which has a
> `struct module *owner` member to using `struct proc_ops`, which doesn't.
> This member allowed the core code to `try_module_get()` the module
> before calling the operation, so that we can avoid calling it if the
> module is in the process of being removed and increase the module use
> count to prevent the module from being unloaded while the open file
> description exists.

The procfs code doesn't modify the module usage count, but it is still
safe to use with removable modules for the following reason:

Both the dereferencing of the proc_ops structure and the calls to its
functions are guarded:

// fs/proc/inode.c

enum {BIAS = -1U<<31};

static inline int use_pde(struct proc_dir_entry *pde)
{
	return likely(atomic_inc_unless_negative(&pde->in_use));
}

static void unuse_pde(struct proc_dir_entry *pde)
{
	if (unlikely(atomic_dec_return(&pde->in_use) == BIAS))
		complete(pde->pde_unload_completion);
}

static ssize_t proc_reg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
{
	struct proc_dir_entry *pde = PDE(file_inode(file));
	ssize_t rv = -EIO;

	if (pde_is_permanent(pde)) {
		return pde_read(pde, file, buf, count, ppos);
	} else if (use_pde(pde)) {
		rv = pde_read(pde, file, buf, count, ppos);
		unuse_pde(pde);
	}
	return rv;
}

use_pde() will, if the pde isn't marked for removal (i.e. its use count
isn't negative), increase the pde use count and return true.  The actual
module pde operations will be called through pde_read(), and the
following unuse_pde() will decrease the pde use count afterwards.
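
To make the counting concrete, here is a minimal userspace sketch of the
same pattern using C11 atomics.  It is not the kernel code:
inc_unless_negative() is a hand-written stand-in for
atomic_inc_unless_negative(), and guarded_call() and sample_op() are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for the kernel's atomic_inc_unless_negative(): increment *v
 * and return true, unless *v is negative (entry marked for removal). */
static bool inc_unless_negative(atomic_int *v)
{
	int old = atomic_load(v);

	do {
		if (old < 0)
			return false;	/* being removed: do not enter */
		/* on failure, 'old' is reloaded and the check is repeated */
	} while (!atomic_compare_exchange_weak(v, &old, old + 1));
	return true;
}

/* Hypothetical operation standing in for a module's proc_read handler. */
static int sample_op(void)
{
	return 42;
}

/* Hypothetical model of a guarded entry point such as proc_reg_read():
 * the operation only runs while the use count is held. */
static int guarded_call(atomic_int *in_use, int (*op)(void))
{
	int rv = -EIO;

	if (inc_unless_negative(in_use)) {
		rv = op();
		int prev = atomic_fetch_sub(in_use, 1);
		assert(prev != 0);	/* catches an unbalanced decrement */
	}
	return rv;
}
```

With a counter that starts at 0, guarded_call() runs the operation and
leaves the counter back at 0.  With a counter that is already negative,
the operation is never called, -EIO comes back, and the counter is left
untouched.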

During pde removal, BIAS is added to the pde use count, which turns it
negative (marking that the pde is being removed).  If the result is still
greater than BIAS (meaning the count was previously positive, so some
operations are still running), the removal code waits for those functions
to return using a completion.

When use_pde() detects that the pde is being removed, it will return
false, which usually causes -EIO to be returned (or -ENOENT for open),
avoiding calling the operations even for an open file description.  If
unuse_pde() detects that the pde is being removed, then it knows it's
being waited on, so will complete the completion (if it's the last
user of the pde).
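
The removal handshake described above can be traced with a small
single-threaded userspace model.  This is only a sketch under
assumptions: struct model_pde and the model_*() functions are
hypothetical names, a plain bool stands in for the kernel completion, and
model_begin_removal() only reports whether the remover would have to wait
instead of actually blocking.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BIAS INT_MIN	/* same bit pattern as the kernel's -1U<<31 */

struct model_pde {
	atomic_int in_use;	/* >= 0: live, < 0: being removed */
	bool completed;		/* stands in for the completion */
};

/* Like use_pde(): take a reference unless removal has started. */
static bool model_use(struct model_pde *p)
{
	int old = atomic_load(&p->in_use);

	do {
		if (old < 0)
			return false;
	} while (!atomic_compare_exchange_weak(&p->in_use, &old, old + 1));
	return true;
}

/* Like unuse_pde(): drop a reference; the last user of a pde that is
 * being removed signals the waiting remover. */
static void model_unuse(struct model_pde *p)
{
	/* atomic_fetch_sub() returns the old value, so "new == BIAS"
	 * is tested as "old == BIAS + 1". */
	int old = atomic_fetch_sub(&p->in_use, 1);

	assert(old != 0 && old != BIAS);	/* unbalanced unuse */
	if (old == BIAS + 1)
		p->completed = true;
}

/* Mark the pde as being removed.  Returns true if users are still
 * inside an operation, i.e. the remover would have to wait. */
static bool model_begin_removal(struct model_pde *p)
{
	/* old + BIAS != BIAS exactly when old != 0 */
	return atomic_fetch_add(&p->in_use, BIAS) != 0;
}
```

With two users inside an operation, model_begin_removal() reports that
the remover must wait, any later model_use() fails, and only the second
model_unuse() sets completed.  With no users, the count lands exactly on
BIAS and there is nothing to wait for.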

This, along with the open files automatically being ->release()'ed, makes
procfs safe for modules.

