Userspace app crash causes system crash on do_exit probe

César Augusto Marcelino dos Santos dev.cmsantos at gmail.com
Tue Sep 1 03:16:22 EDT 2020


Dear community,

I have created a kernel module that adds probes to the do_execve() and
do_exit() kernel functions (code at the end of this email). It is running
on a custom Linux-based system, kernel version 3.18.31.

The goal of this module is to capture several pieces of information
about any process that is about to start, or that is about to leave
userspace. I have tested the following scenarios:
- app starts
- app finishes its execution gracefully
- app is killed
- app crashes

In the first three cases I can retrieve information from the process,
but in the last case I get an unexpected kernel Oops. More
specifically, I have trouble retrieving the command-line arguments of
the process, and it seems to be due to some unusual race condition.

To make things easier, I have simplified the original source code and
focused on the command-line part. The getCommandLine() function is not
shown here because it is a copy of get_cmdline() from mm/util.c
(https://elixir.bootlin.com/linux/latest/source/mm/util.c#L855).

This version of get_cmdline() uses synchronization mechanisms (in my
case, I implemented it with semaphores instead of spinlocks), and it
causes the kernel to crash:
    ...
    BUG: scheduling while atomic: mysegfaultapp/6037/0x00000002
    Modules linked in: ...
    CPU: 0 PID: 9313 Comm: mysegfaultapp Tainted: P        W  O   3.18.31 #2
    [<c0014024>] (unwind_backtrace) from [<c00119f0>] (show_stack+0x10/0x14)
    [<c00119f0>] (show_stack) from [<c0039830>] (__schedule_bug+0x44/0x60)
    [<c0039830>] (__schedule_bug) from [<c0838040>] (__schedule+0x68/0x470)
    [<c0838040>] (__schedule) from [<c083a864>] (rwsem_down_read_failed+0x104/0x130)
    [<c083a864>] (rwsem_down_read_failed) from [<bf000918>] (getCommandLine.constprop.0+0x44/0x160 [mymodule])
    [<bf000918>] (getCommandLine.constprop.0 [mymodule]) from [<bf000644>] (doExitHandler+0x1dc/0x25c [mymodule])
    [<bf000644>] (doExitHandler [mymodule]) from [<c0021850>] (SyS_exit_group+0x0/0x10)
    [<c0021850>] (SyS_exit_group) from [<00000009>] (0x9)
    Unable to handle kernel paging request at virtual address fffffffe
    pgd = dbc20000
    [fffffffe] *pgd=9f3f8821, *pte=00000000, *ppte=00000000
    Internal error: Oops: 80000007 [#1] PREEMPT ARM
    ...

If I instead use an implementation without synchronization mechanisms
(the one that matches my kernel version,
https://elixir.bootlin.com/linux/v3.18.31/source/mm/util.c#L355), then
when a running app hits a segmentation fault and crashes, I am not able
to report its command line, but the system keeps running. For
reference, the app is a dummy app that causes a segfault on purpose,
called "mysegfaultapp" here.
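
For anyone who wants to reproduce the "app crashes" scenario, a crashing
test program of this kind can be sketched as below. This is a
hypothetical stand-in, not the actual mysegfaultapp source (which is not
included in this email); the names crash_on_purpose and
run_crashing_child are made up for illustration. The crash happens in a
forked child so that the parent can report which signal ended it.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the crashing app: store through a NULL pointer. */
static void crash_on_purpose(void)
{
    /* The pointer variable is volatile, so the compiler must load it at
     * run time and cannot optimize the faulting store away. */
    int *volatile bad = NULL;
    *bad = 42;
}

/*
 * Fork a child that crashes and wait for it.
 * Returns the number of the signal that killed the child,
 * 0 if the child exited normally, or -1 if fork/waitpid failed.
 */
static int run_crashing_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        crash_on_purpose();
        _exit(0);               /* not reached if the store faults */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux system run_crashing_child() is expected to return
SIGSEGV, and the child goes through the exit path that the do_exit probe
observes.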

Given these situations, I have a few questions, and I hope the
community can give me some directions on where to look further:
1) Is it possible to retrieve the command-line arguments from a
userspace process that crashed?
2) How can I investigate the reason for this crash in
rwsem_down_read_failed?
3) If I go with the v3.18.31 version, which doesn't use synchronization
structures (semaphores or spinlocks), what are the risks?


Please let me know if you need further information, or if you have any
questions.


Thanks in advance,
Cesar.


-------------------------------------------------------------------------------------------------------------------------------------------
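#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/sched.h>
#include <linux/string.h>

/* The kprobes registration functions are exported to GPL modules only. */
MODULE_LICENSE("GPL");

/* getCommandLine() is omitted here; it is a copy of get_cmdline() from mm/util.c. */

static struct kretprobe initProcess;
static struct jprobe exitProcess;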

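/* Entry handler for do_exit(); the signature must match do_exit(long code). */
static void doExitHandler(long code) {
    char commandLine[200];
    memset(commandLine, 0, sizeof(commandLine));

    /* Leave the last byte untouched so the buffer is always NUL-terminated. */
    if (getCommandLine(current, commandLine, sizeof(commandLine) - 1) <= 0) {
        strcpy(commandLine, "ERROR");
    }

    printk(KERN_INFO "doExitHandler %s\n", commandLine);
    jprobe_return();
}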

/* Return handler for do_execve(); runs in the context of the new program. */
static int doExecHandler(struct kretprobe_instance *pMetadata, struct pt_regs *pRegs) {
    char commandLine[200];
    memset(commandLine, 0, sizeof(commandLine));

    /* Leave the last byte untouched so the buffer is always NUL-terminated. */
    if (getCommandLine(current, commandLine, sizeof(commandLine) - 1) <= 0) {
        strcpy(commandLine, "ERROR");
    }

    printk(KERN_INFO "doExecHandler %s\n", commandLine);
    return 0;
}

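static int myInit(void) {
    int retval;

    initProcess.kp.symbol_name = "do_execve";
    initProcess.handler = doExecHandler;
    retval = register_kretprobe(&initProcess);
    if (retval < 0) {
        printk(KERN_ERR "register_kretprobe failed: %d\n", retval);
        return retval;
    }

    exitProcess.kp.symbol_name = "do_exit";
    exitProcess.entry = JPROBE_ENTRY(doExitHandler);
    retval = register_jprobe(&exitProcess);
    if (retval < 0) {
        printk(KERN_ERR "register_jprobe failed: %d\n", retval);
        /* Do not leave the first probe registered if the second one fails. */
        unregister_kretprobe(&initProcess);
    }

    return retval;
}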

static void myExit(void) {
    unregister_kretprobe(&initProcess);
    unregister_jprobe(&exitProcess);
}

module_init(myInit);
module_exit(myExit);
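
A side note on the printk() calls above: get_cmdline() copies the raw
argument area, in which the arguments are separated by NUL bytes, so
printing the buffer with %s stops after the first argument. A minimal
sketch of a helper that joins the arguments with spaces follows. The
name flatten_cmdline and its parameters are hypothetical and not part of
the module; it is written as plain C so it can be tried in userspace
first.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: turn a buffer of NUL-separated arguments into one
 * space-separated, NUL-terminated string, in place.
 *   buf  - buffer holding the arguments
 *   len  - number of bytes that were filled in
 *   size - total capacity of buf
 * Returns the length of the resulting string.
 */
static size_t flatten_cmdline(char *buf, size_t len, size_t size)
{
    size_t i;

    assert(buf != NULL && size > 0);

    if (len >= size)
        len = size - 1;         /* leave room for the terminator */

    /* Drop trailing NULs so the result does not end in spaces. */
    while (len > 0 && buf[len - 1] == '\0')
        len--;

    /* Every remaining NUL is a separator between two arguments. */
    for (i = 0; i < len; i++)
        if (buf[i] == '\0')
            buf[i] = ' ';

    buf[len] = '\0';
    return len;
}
```

In the handlers it would be called with the positive return value of
getCommandLine() as len, right before the printk().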


