weird printk operation early at start_kernel..

Chan Kim ckim at etri.re.kr
Wed Mar 2 21:50:27 EST 2022


 

Hi, I've forgotten about this question.

Recently I found that the reason printk in the next line seemed to have
started when the program is stuck during the first printk processing was
because there was an exception and the exception handler also calls printk.
I thought the printk was coming from the next printk in start_kernel( ) but
it was from data abort exception handler.

The data abort exception occurred because in armv8, when the address is not
aligned for a device region, the mmu faults. (I set non-cacheable for the
8MB memory region just to see the value I wrote in memory using ocd, but
this made the mmu think that region is a device region because it's
non-cacheable.  The first printk was calling memcpy for byte stream and it
was not word aligned, because it's just byte stream) So I reverted the
memory attribute to normal and used __flush_dcache_area to see the data I
wrote in memory.

Hope this helps anyone.

Thanks!

Chan Kim

 

From: Chan Kim <ckim at etri.re.kr> 
Sent: Thursday, February 17, 2022 4:26 PM
To: kernelnewbies at kernelnewbies.org
Subject: weird printk operation early at start_kernel..

 

Hello all,

 

I'm debugging linux boot on an FPGA board (with limited RAM size, only 8MB
SRAM in place of DDR3 memory).

I'm using linux 5.4.21 for now and because the printk doesn't work yet, I'm
writing some info on memory to see how it's doing.

After it goes into start_kernel, it calls smp_setup_processor_id(); and
inside it, it calls pr_info which is just printk.

 

==== init/main.c

uint64_t mydebug2[64] = {0,};

uint32_t myidx = 0;

uint64_t mycnt = 0; // for left shift, 64bit

 

asmlinkage __visible void __init start_kernel(void)

{

    char *command_line;

    char *after_dashes;

    uint64_t el;

 

set_task_stack_end_magic(&init_task);

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1000;

    smp_setup_processor_id();

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x2000;

 

==== arch/arm64/kernel/setup.c

void __init smp_setup_processor_id(void)

{

    u64 mpidr = read_cpuid_mpidr() & MPIDR_HWID_BITMASK;

    cpu_logical_map(0) = mpidr;

 

    /*

     * clear __my_cpu_offset on boot CPU to avoid hang caused by

     * using percpu variable early, for example, lockdep will

     * access percpu variable inside lock_release

     */

    set_my_cpu_offset(0);

mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1100;

    pr_info("test message\n"); // just printk

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1200;

    pr_info("Booting Linux on physical CPU 0x%010lx [0x%08x]\n", // printk

        (unsigned long)mpidr, read_cpuid_id());

}

 

==== kernel/printk/printk.c

asmlinkage __visible int printk(const char *fmt, ...)

{

    va_list args;

    int r;

 

    va_start(args, fmt);

    //why.. read_pkcontext(0x32);

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1110;

    r = vprintk_func(fmt, args);

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1120;

    va_end(args);

 

    return r; 

}

 

==== kernel/printk/printk_safe.c

__printf(1, 0) int vprintk_func(const char *fmt, va_list args)

{

    /* 

     * Try to use the main logbuf even in NMI. But avoid calling console

     * drivers that might have their own locks.

     */

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1111;

    if ((this_cpu_read(printk_context) & PRINTK_NMI_DIRECT_CONTEXT_MASK) &&

       raw_spin_trylock(&logbuf_lock)) {

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1112;

        int len;    

 

        len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);

        raw_spin_unlock(&logbuf_lock);

        defer_console_output();

        return len;

    }   

 

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1113;

    /* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */

    if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) {

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1114;

        return vprintk_nmi(fmt, args);

    }

 

    //write_pkcontext();   

    /* Use extra buffer to prevent a recursion deadlock in safe mode. */

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1115;

    if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) {

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1116;

        return vprintk_safe(fmt, args);

    }

    

    /* No obstacles. */ 

    mydebug2[(myidx++)&63] = (mycnt++ << 32) + 0x1117;

    return vprintk_default(fmt, args); 

}

 

When I examine mydebug2 data, it is like this. (the physical address of
mydebug2 is 0x80622000 and I can read the value in memory using ocd,

The LSB 32 bits are displayed first here. The memory is set non-cacheable
for this test.)

 

0x80622000: 00001000 00000000 00001100 00000001 00001110 00000002 00001111
00000003

0x80622020: 00001113 00000004 00001115 00000005 00001117 00000006 00001110
00000007

0x80622040: 00001111 00000008 00001113 00000009 00001115 0000000a 00001116
0000000b

0x80622060: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000

 

As you can see, in the vprintk_safe function, it goes into vprintk_default
and then without 0x1120 being written, printk is called again and this time
it goes into vprintk_nmi and seems to hang (waiting for console lock I
remember..).

Could anyone give me some light on what's going on here? (Actual print to
serial port is supposed to come out much later inside setup_arch).

Any comment will be very much appreciated.

 

Thank you!

Chan Kim

 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20220303/58e88f78/attachment-0001.html>


More information about the Kernelnewbies mailing list