High Latency during packet transmission
Michael Krysiak
mrk191 at gmail.com
Sat Jun 22 09:37:49 EDT 2013
Sure. Also, here are our current buffer sizes.
net.core.rmem_max=8388608
net.core.wmem_max=8388608
net.ipv4.tcp_rmem=4096 262144 8388608
net.ipv4.tcp_wmem=4096 262144 8388608
net.core.netdev_max_backlog=20000 # I just increased this from 2500 this morning while testing
net.ipv4.tcp_mem=752256 1003008 1504512
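
For completeness, here is a quick way to double-check which values are
actually in effect and to change one at runtime. This is only a minimal
sketch; the same key=value lines can be added to /etc/sysctl.conf to
persist across reboots.

  # Read back the values currently in effect
  sysctl net.core.rmem_max net.core.wmem_max
  sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.ipv4.tcp_mem

  # Apply a change at runtime, e.g. the netdev backlog bump noted above
  sysctl -w net.core.netdev_max_backlog=20000

And here is the netstat -s output you asked for: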
Ip:
15836310622 total packets received
0 forwarded
0 incoming packets discarded
15833968779 incoming packets delivered
10945745295 requests sent out
24 dropped because of missing route
Icmp:
713392 ICMP messages received
27 input ICMP message failed.
ICMP input histogram:
destination unreachable: 309
echo requests: 712035
echo replies: 1023
timestamp request: 5
address mask request: 15
714011 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 372
echo request: 1599
echo replies: 712035
timestamp replies: 5
IcmpMsg:
InType0: 1023
InType3: 309
InType8: 712035
InType13: 5
InType17: 15
InType37: 5
OutType0: 712035
OutType3: 372
OutType8: 1599
OutType14: 5
Tcp:
600367 active connections openings
306733 passive connection openings
29 failed connection attempts
144696 connection resets received
37 connections established
15833082121 segments received
10910239657 segments send out
34413098 segments retransmited
0 bad segments received.
140134 resets sent
Udp:
173202 packets received
65 packets to unknown port received.
0 packet receive errors
378536 packets sent
UdpLite:
TcpExt:
6 invalid SYN cookies received
5 resets received for embryonic SYN_RECV sockets
450342 TCP sockets finished time wait in fast timer
129 time wait sockets recycled by time stamp
1961229 packets rejects in established connections because of timestamp
2319351 delayed acks sent
4920 delayed acks further delayed because of locked socket
Quick ack mode was activated 24892595 times
1803250 packets directly queued to recvmsg prequeue.
9817384 packets directly received from backlog
7446718877 packets directly received from prequeue
7251538419 packets header predicted
684279 packets header predicted and directly queued to user
1691392185 acknowledgments not containing data received
8410953710 predicted acknowledgments
14145456 times recovered from packet loss due to SACK data
7 bad SACKs received
Detected reordering 35 times using FACK
Detected reordering 926621 times using SACK
Detected reordering 52787 times using time stamp
3572157 congestion windows fully recovered
7696538 congestion windows partially recovered using Hoe heuristic
TCPDSACKUndo: 9245647
2212 congestion windows recovered after partial ack
179224 TCP data loss events
TCPLostRetransmit: 1590
6911 timeouts after SACK recovery
101 timeouts in loss state
22817593 fast retransmits
11134591 forward retransmits
435599 retransmits in slow start
3531 other TCP timeouts
13687 sack retransmits failed
24988980 DSACKs sent for old packets
39271 DSACKs sent for out of order packets
15785429 DSACKs received
2567 DSACKs for out of order packets received
137242 connections reset due to unexpected data
70 connections reset due to early user close
1 connections aborted due to timeout
TCPDSACKIgnoredOld: 466
TCPDSACKIgnoredNoUndo: 743779
TCPSpuriousRTOs: 72
TCPSackShifted: 170376999
TCPSackMerged: 62950559
TCPSackShiftFallback: 166800040
TCPBacklogDrop: 407
IpExt:
InBcastPkts: 5
InOctets: 64795864053403
OutOctets: 153981389131849
InBcastOctets: 2880
On Sat, Jun 22, 2013 at 9:06 AM, Vaughn at MSN <vclinton at msn.com> wrote:
> Can you post your netstat -s from one of the servers that is having
> the problem?
>
> From: Michael Krysiak <mrk191 at gmail.com>
> Sent: Saturday, June 22, 2013 7:00 AM
> To: kernelnewbies at kernelnewbies.org
> Subject: High Latency during packet transmission
>
> I've been trying to identify why we're seeing frequent stalls during
> packet transmission in our GPFS cluster with the bnx2 driver (as well
> as with other NICs/drivers), but I'm at the limit of my current
> knowledge. I used the perf netdev events (as described in
> http://lwn.net/Articles/397654/) to measure the tx times, and I see
> spikes such as the following:
>
> dev    len   Qdisc              netdevice        free
> em2     98   807740.878085sec   0.002msec       0.061msec
> em2     98   807740.878119sec   0.002msec       0.029msec
> em2     98   807741.140600sec   0.005msec       0.092msec
> em2  65226   807742.763833sec   0.007msec       0.436msec
> em2     66   807727.081712sec   0.001msec   16246.072msec
> em2     66   807740.882741sec   0.001msec    3457.625msec
>
>
> Based on the source for netdev-times.py, the "free" column is the time
> between trace_net_dev_xmit() (in net/core/dev.c) and trace_kfree_skb(),
> but I'm not sure how to dig any deeper. Are there any common causes for
> this behavior? What's the best way to further break down the time
> difference between the xmit and kfree trace points?
>
>
>
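
On the quoted question about breaking down the gap between the xmit and
kfree trace points: one way to watch both events outside of the perf
script is to enable them directly with ftrace and compare the raw
per-event timestamps. This is only a minimal sketch; it assumes debugfs
is mounted at /sys/kernel/debug, root access, and /tmp/xmit-free.log is
just a placeholder output file.

  cd /sys/kernel/debug/tracing
  echo 0 > tracing_on
  echo 1 > events/net/net_dev_xmit/enable   # fires in dev_hard_start_xmit()
  echo 1 > events/skb/kfree_skb/enable      # skbs freed as drops
  echo 1 > events/skb/consume_skb/enable    # skbs freed on the normal path
  echo 1 > tracing_on
  cat trace_pipe > /tmp/xmit-free.log       # placeholder output file

Matching the skbaddr field of a net_dev_xmit line against the later
kfree_skb/consume_skb line for the same address reproduces the
xmit-to-free interval that netdev-times.py reports, with the raw
timestamps for each skb visible.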