Module vs Kernel main performance

Peter Senna Tschudin peter.senna at gmail.com
Thu Jun 7 19:36:15 EDT 2012


Hi again!

On Tue, May 29, 2012 at 8:50 PM, Abu Rasheda <rcpilot2010 at gmail.com> wrote:
> Hi,
>
> I am working on the x86_64 arch. I profiled a Linux kernel module
> with oprofile and noticed that a large share of cycles is spent in the
> copy_from_user call. I compared the same flow from the kernel proper
> and noticed that, for more data throughput, far fewer cycles are spent
> in copy_from_user: the kernel proper uses 1/8 of the cycles of the
> module. (There is a user process that keeps sending data, like iperf.)
>
> I used the perf tool to gather some statistics. For the call from the
> kernel proper:
>
> 185,719,857,837 cpu-cycles            #    3.318 GHz                  [90.01%]
>  99,886,030,243 instructions          #    0.54  insns per cycle      [95.00%]
>   1,696,072,702 cache-references      #   30.297 M/sec                [94.99%]
>     786,929,244 cache-misses          #   46.397 % of all cache refs  [95.00%]
>  16,867,747,688 branch-instructions   #  301.307 M/sec                [95.03%]
>      86,752,646 branch-misses         #    0.51% of all branches      [95.00%]
>   5,482,768,332 bus-cycles            #   97.938 M/sec                [20.08%]
>    55967.269801 cpu-clock
>    55981.842225 task-clock            #    0.933 CPUs utilized
>
> and for the call from the kernel module:
>
>   9,388,787,678 cpu-cycles            #    1.527 GHz                  [89.77%]
>   1,706,203,221 instructions          #    0.18  insns per cycle      [94.59%]
>     551,010,961 cache-references      #   89.588 M/sec                [94.73%]
>     369,632,492 cache-misses          #   67.083 % of all cache refs  [95.18%]
>     291,358,658 branch-instructions   #   47.372 M/sec                [94.68%]
>      10,291,678 branch-misses         #    3.53% of all branches      [95.01%]
>     582,651,999 bus-cycles            #   94.733 M/sec                [20.55%]
>     6112.471585 cpu-clock
>     6150.490210 task-clock            #    0.102 CPUs utilized
>             367 page-faults           #    0.000 M/sec
>             367 minor-faults          #    0.000 M/sec
>               0 major-faults          #    0.000 M/sec
>          25,770 context-switches      #    0.004 M/sec
>              23 cpu-migrations        #    0.000 M/sec

How did you call copy_from_user from the kernel module?
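
As a side check, the ratios perf prints next to the counters can be
recomputed from the raw numbers quoted above. What follows is only a
minimal Python sketch: the dictionary names and the summarize() helper
are illustrative and not part of perf, and the counter values are copied
by hand from the two runs.

```python
# Raw counters copied from the two quoted perf runs.
kernel_proper = {
    "cpu-cycles": 185_719_857_837,
    "instructions": 99_886_030_243,
    "cache-references": 1_696_072_702,
    "cache-misses": 786_929_244,
    "branch-instructions": 16_867_747_688,
    "branch-misses": 86_752_646,
}

kernel_module = {
    "cpu-cycles": 9_388_787_678,
    "instructions": 1_706_203_221,
    "cache-references": 551_010_961,
    "cache-misses": 369_632_492,
    "branch-instructions": 291_358_658,
    "branch-misses": 10_291_678,
}


def summarize(counters):
    """Return instructions per cycle and miss percentages for one run."""
    return {
        "ipc": counters["instructions"] / counters["cpu-cycles"],
        "cache_miss_pct": 100.0 * counters["cache-misses"]
        / counters["cache-references"],
        "branch_miss_pct": 100.0 * counters["branch-misses"]
        / counters["branch-instructions"],
    }


for name, counters in (("kernel proper", kernel_proper),
                       ("kernel module", kernel_module)):
    s = summarize(counters)
    print(f"{name}: IPC={s['ipc']:.2f} "
          f"cache-miss={s['cache_miss_pct']:.3f}% "
          f"branch-miss={s['branch_miss_pct']:.2f}%")
```

The two runs cover very different amounts of work (task-clock is about
55982 in one and about 6150 in the other), so the ratios are probably a
fairer basis for comparison than the absolute counts.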

>
>
> So obviously, the CPU is stalling while it copies data, and there are
> more cache misses. My question is: is there a difference between calling
> copy_from_user from the kernel proper and calling it from an LKM?

[]'s

-- 
Peter Senna Tschudin
peter.senna at gmail.com
gpg id: 48274C36


