module vs main kernel

Abu Rasheda rcpilot2010 at gmail.com
Wed Dec 4 13:16:42 EST 2013


<rcpilot2010 at gmail.com> wrote:
>>>> I have my own implementation of the socket APIs.
>>>>
>>>> I call sock_unregister(AF_INET) and then sock_register(&inet_family_ops); this
>>>> replaces the kernel-resident socket calls with my own socket calls. My code
>>>> is loaded as a kernel module.
>>>>
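To make concrete what I did there, here is a minimal sketch along the lines
of my module's init path. The names and the empty create body are only
illustrative, not my actual code:

#include <linux/module.h>
#include <linux/errno.h>
#include <linux/net.h>
#include <linux/socket.h>

/* illustrative replacement for the in-kernel AF_INET family handler */
static int my_inet_create(struct net *net, struct socket *sock,
			  int protocol, int kern)
{
	/* ... set up sock->ops with my own proto_ops here ... */
	return -EAFNOSUPPORT;	/* placeholder in this sketch */
}

static struct net_proto_family my_inet_family_ops = {
	.family	= AF_INET,
	.create	= my_inet_create,
	.owner	= THIS_MODULE,
};

static int __init my_sock_init(void)
{
	sock_unregister(AF_INET);	/* drop the kernel's AF_INET handler */
	return sock_register(&my_inet_family_ops);	/* install mine */
}

static void __exit my_sock_exit(void)
{
	sock_unregister(AF_INET);	/* note: does not restore the original */
}

module_init(my_sock_init);
module_exit(my_sock_exit);
MODULE_LICENSE("GPL");
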
>>>> My question: is the Linux kernel able to call its own socket code more
>>>> efficiently (less overhead, fewer CPU cycles) than mine? The code is running
>>>> on the Intel x86_64 arch.
>>>>
>>>> Any pointer is appreciated.
>>>
>>>
>>> IMHO, strictly from a module vs. core kernel code perspective, there is
>>> no speed difference.
>>>
>>> The reason is that once a module is loaded, it becomes part of the kernel
>>> itself. It is not separated into an isolated segment or anything like that.
>>> Yes, the module is loaded into a vmalloc-ed memory area, but that's it.
>>
>> Is it possible that, since my socket API code is located in a
>> different part of memory, it needs to make a long jump
>> (resulting in a d-cache miss)?
>>
>> Here are the statistics from the perf stat tool:
>>
>> Linux code:
>> 112,880,163,053 instructions:HG           #    0.72  insns per cycle
>>
>> My code:
>> 32,074,097,170 instructions:HG           #    0.34  insns per cycle
>
> Hi...
>
> Please Cc: kernelnewbies as well.

Sorry about this, I thought I did.

> As for the d-cache miss, I think that's not due to a long jump, but to
> accessing something that is not cache aligned.

Do you mean cache-line aligned, or the address being aligned to 4 or 8 bytes?
Does data alignment really have that much effect on performance on Intel?
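For instance, is this the kind of cache-line alignment you mean? A
hypothetical structure, not taken from my code:

#include <linux/cache.h>
#include <linux/types.h>

/* hot structure aligned (and padded) to the L1 cache line size, so its
 * fields do not straddle a line boundary or share a line with unrelated
 * data */
struct my_sock_stats {
	u64	rx_packets;
	u64	tx_packets;
} ____cacheline_aligned;
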

> Or maybe you can think of something that can be prefetched. That way, you
> utilize the processor pipeline better, but use it wisely.
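
On prefetching, I take it you mean the usual pattern of touching the next
element while the current one is being processed, something like this
(illustrative loop only, not my code):

#include <linux/prefetch.h>
#include <linux/list.h>
#include <linux/types.h>

struct my_item {
	struct list_head node;
	u64 payload;
};

static u64 sum_items(struct list_head *head)
{
	struct my_item *it;
	u64 total = 0;

	list_for_each_entry(it, head, node) {
		prefetch(it->node.next);	/* warm the next node before we need it */
		total += it->payload;
	}
	return total;
}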

How about TLB misses? As I understand it, the main kernel is stored in a
permanently mapped area while a module is stored in a dynamically mapped
area.
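
I was thinking of confirming that with a throwaway module like the one below
(illustrative only), which just prints the address of one of its own functions
next to the address of a core kernel function such as printk, to see that the
two land in different parts of the x86_64 address map:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

static int my_module_func(void)
{
	return 0;
}

static int __init where_init(void)
{
	/* module text is placed in the module mapping space set up at load
	 * time, while core kernel text sits in the kernel's permanently
	 * mapped text region */
	pr_info("module function at %p\n", (void *)my_module_func);
	pr_info("core printk     at %p\n", (void *)printk);
	return 0;
}

static void __exit where_exit(void)
{
}

module_init(where_init);
module_exit(where_exit);
MODULE_LICENSE("GPL");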


