Inconsistent CPU usage in IP forwarding test

Abu Rasheda rcpilot2010 at gmail.com
Fri Apr 4 00:43:00 EDT 2014


On Thursday, April 3, 2014, Oleg A. Arkhangelsky <sysoleg at yandex.ru> wrote:

> Hello all,
>
> We've got very strange behavior when testing IP packet forwarding
> performance on a Sandy Bridge platform (Supermicro X9DRH with the latest
> BIOS). This is a two-socket E5-2690 CPU system. Using a different PC we
> generate DDoS-like traffic at a rate of about 4.5 million packets per
> second. Traffic is received by two Intel 82599 NICs and forwarded using
> the second port of one of these NICs. All load is evenly distributed
> between the two nodes, so SI usage is virtually equal across all 32 CPUs.
>
> Now the strangest part. A few moments after pktgen starts on the traffic
> generator PC, average CPU usage on the SB system goes to 30-35%. No packet
> drops, no rx_missed_errors, no rx_no_dma_resources. Very nice. But SI
> usage then starts to decrease gradually. After about 10 seconds we see
> ~15% SI on average among all CPUs. Still no packet drops, the same RX rate
> as at the beginning, and the RX packet count equals the TX packet count.
> After some time, average SI usage starts to climb again; after peaking at
> the initial 30-35% it drops back to 15%. This pattern repeats every 80
> seconds, and the interval is very stable. It is undoubtedly tied to the
> test start time: if we start the test, interrupt it after 10 seconds, and
> start it again, we see the same 30% SI peak a few moments later, and all
> subsequent timings are the same.
>
> During the high-load periods we see this in "perf top -e cache-misses":
>
>             14017.00 24.9% __netdev_alloc_skb           [kernel.kallsyms]
>              5172.00  9.2% _raw_spin_lock               [kernel.kallsyms]
>              4722.00  8.4% build_skb                    [kernel.kallsyms]
>              3603.00  6.4% fib_table_lookup             [kernel.kallsyms]
>
> During the "15% load time" top is different:
>
>             11090.00 20.9% build_skb                [kernel.kallsyms]
>              4879.00  9.2% fib_table_lookup         [kernel.kallsyms]
>              4756.00  9.0% ipt_do_table
> /lib/modules/3.12.15-BUILD-g2e94e30-dirty/kernel/net/ipv4/netfilter/ip_tables.ko
>              3042.00  5.7% nf_iterate               [kernel.kallsyms]
>
> And __netdev_alloc_skb is near the end of the list:
>
>               911.00  0.5% __netdev_alloc_skb             [kernel.kallsyms]
>
> Some info from "perf stat -a sleep 2":
>
> 15% SI:
>        28640006291 cycles                    #    0.447 GHz
>       [83.23%]
>        38764605205 instructions              #    1.35  insns per cycle
>
> 30% SI:
>        56225552442 cycles                    #    0.877 GHz
>       [83.23%]
>        39718182298 instructions              #    0.71  insns per cycle
>
> The CPUs never go deeper than the C1 state, and all core speeds reported
> by /proc/cpuinfo are constant at 2899.942 MHz. ASPM is disabled.
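>
> The "GHz" figure perf prints is unhalted cycles divided by total CPU time,
> so 0.447 GHz on a 2.9 GHz part just means the cores sit halted in C1 for
> most of the interval. C-state residency can be cross-checked directly with
> turbostat from the kernel tree (tools/power/x86/turbostat); flags and
> column names vary with the turbostat version:
>
>     # print per-core C1 residency and effective clock every 5 seconds
>     turbostat -i 5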
>
> All non-essential userspace apps were explicitly killed for the duration
> of the test, and there were no active cron jobs either, so we can assume
> no interference from userspace.
>
> The kernel version is 3.12.15 (ixgbe 3.21.2), but we see the same behavior
> with the ancient 2.6.35 (ixgbe 3.10.16), although on 2.6.35 we sometimes
> get a 160-170 second interval and different symbols in the "perf top"
> output (especially local_bh_enable(), which completely blows my mind).
>
> Does anybody have any thoughts on the reasons for this kind of behavior?
> The Sandy Bridge CPU has many uncore/offcore events that I can sample;
> maybe some of them could shed light on it?
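>
> For example, the generic NUMA-locality cache events (perf's aliases; their
> availability depends on the kernel and PMU) could be compared between a
> 15% window and a 30% window:
>
>     # local vs. remote DRAM accesses, sampled system-wide for 2 seconds
>     perf stat -a -e node-loads,node-load-misses,LLC-loads,LLC-load-misses sleep 2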
>
>
Is it a NUMA system? This happens when one node tries to access memory
attached to the other CPU's socket.
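
A quick way to check, as a sketch (the interface name eth0 and the IRQ
number below are hypothetical), is to see which node each 82599 hangs off
and whether its queue interrupts are serviced by CPUs on that same node:

    # NUMA node the NIC's PCI device is attached to (-1 = no NUMA info)
    cat /sys/class/net/eth0/device/numa_node

    # per-node memory allocation statistics while the test runs
    numastat

    # find the NIC's IRQs, then pin one to a CPU on the local node
    grep eth0 /proc/interrupts
    echo 4 > /proc/irq/123/smp_affinity   # mask 0x4 = CPU 2; 123 is made up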

Abu Rasheda

