Adding a sched_class after the removal of ".next" regarding priority
Paulo Miguel Almeida
paulo.miguel.almeida.rodenas at gmail.com
Thu May 20 20:25:07 EDT 2021
On Fri, May 21, 2021 at 12:05:27AM +0200, J Mårtensson wrote:
> Hi,
> I have been trying to add a new scheduler to the Linux kernel. I have
> found that to add a sched_class, I need to add it to SCHED_DATA in
> vmlinux.lds.h. instead of editing the now removed .next variable.
I'm assuming that you are referring to this patch from Steven Rostedt, right?
https://lore.kernel.org/lkml/20191219214451.340746474@goodmis.org/
> Depending on what order I put into the priority list, it will crash
> the kernel during the booting process after rebooting. Any tips on
> what could be causing this would be appreciated!
I am not sure if that will help, but if I were you I would try to
isolate the problem with the debugging mechanisms available in the
kernel. Have you tried compiling the kernel with CONFIG_SCHED_DEBUG=y
and making use of early printk?
https://www.kernel.org/doc/html/latest/x86/earlyprintk.html
Since it's rebooting due to the error, you won't be able to see the
logs. If you enable early printk and send those messages to a second
piece of hardware that isn't rebooting, then at least you will be able
to read them and add/remove print statements to help you figure out
what's going wrong.
There is no silver-bullet solution for that, but I'm sure that you will
have a lot of fun trying to debug this. Once you find the solution,
please share it with us. I'm sure it will be beneficial for future
developers with similar questions.
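The debugging setup suggested above might look roughly like the fragment
below. It is only a sketch: it assumes an x86 machine, a 2021-era tree
where both options exist, and a serial port on ttyS0 at 115200 baud that
a second machine (or an emulator's serial output) is listening on.

```
# .config
CONFIG_SCHED_DEBUG=y
CONFIG_EARLY_PRINTK=y

# kernel command line; ",keep" leaves the early console active after
# the regular console takes over
earlyprintk=serial,ttyS0,115200,keep
```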
>
> Currently this works
>
> #define SCHED_DATA \
> STRUCT_ALIGN(); \
> __begin_sched_classes = .; \
> *(__idle_sched_class) \
> *(__my_sched_class) \
> *(__fair_sched_class) \
> *(__rt_sched_class) \
> *(__dl_sched_class) \
> *(__stop_sched_class) \
> __end_sched_classes = .;
>
This most likely works because, during boot, every runnable task is
picked up by one of the classes from dl_sched_class down to
fair_sched_class. So either the CPU is never idle during boot
(unlikely), or the bug in your __my_sched_class isn't triggered when
there is nothing in the CPU run queue.
I could be wrong though, so if anyone has a better explanation, please
chime in.
> While this does not
>
> #define SCHED_DATA \
> STRUCT_ALIGN(); \
> __begin_sched_classes = .; \
> *(__idle_sched_class) \
> *(__fair_sched_class) \
> *(__my_sched_class) \
> *(__rt_sched_class) \
> *(__dl_sched_class) \
> *(__stop_sched_class) \
> __end_sched_classes = .;
>
It's hard to speculate about the reason why it's failing, but if I were
a gambling man I would say that, *given the fact* that __my_sched_class
now has a higher priority than __fair_sched_class, it breaks when the
scheduler tries to call the __my_sched_class methods defined with the
DEFINE_SCHED_CLASS macro.
Example from the fair.c sched class:
https://github.com/torvalds/linux/blob/02dbb7246c5bbbbe1607ebdc546ba5c454a664b1/kernel/sched/fair.c#L11261-L11304
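To make the difference between the two layouts concrete, here is a
small userspace model. It is not kernel code. The struct is a cut-down
stand-in for struct sched_class, the fields nr_runnable and consulted
and the function pick_class are hypothetical, and the arrays stand in
for the section that SCHED_DATA lays out (lowest priority first, as in
the listings above). It assumes the core scheduler walks the classes
from the highest priority down and stops at the first one that has
something to run.

```c
#include <stddef.h>

/* Userspace model only; a cut-down stand-in for struct sched_class. */
struct sched_class {
    const char *name;
    int nr_runnable;   /* hypothetical: tasks queued in this class */
    int consulted;     /* hypothetical: times the core asked this class */
};

enum { NR_CLASSES = 6 };

/*
 * Walk from the last entry (highest priority) down to the first and
 * return the first class with something runnable. Every class that is
 * looked at on the way has its "consulted" counter bumped.
 */
static struct sched_class *pick_class(struct sched_class *classes)
{
    for (size_t i = NR_CLASSES; i-- > 0; ) {
        classes[i].consulted++;
        if (classes[i].nr_runnable > 0)
            return &classes[i];
    }
    return NULL; /* not reached while idle always has a task */
}

/* Same order as the SCHED_DATA that boots: my sits below fair. */
static struct sched_class boots[NR_CLASSES] = {
    { "idle", 1, 0 }, { "my", 0, 0 }, { "fair", 1, 0 },
    { "rt", 0, 0 },   { "dl", 0, 0 }, { "stop", 0, 0 },
};

/* Same order as the SCHED_DATA that crashes: my sits above fair. */
static struct sched_class crashes[NR_CLASSES] = {
    { "idle", 1, 0 }, { "fair", 1, 0 }, { "my", 0, 0 },
    { "rt", 0, 0 },   { "dl", 0, 0 },   { "stop", 0, 0 },
};
```

With a fair task runnable, pick_class(boots) returns the fair entry
without ever looking at "my", while pick_class(crashes) has to consult
"my" before it reaches fair. In this model, a broken callback in the new
class would stay hidden in the first layout and be hit on every pick in
the second.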
Paulo Miguel Almeida
>
> Regards
> Jacob
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies at kernelnewbies.org
> https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies