NET: TCP Server Remains in ESTABLISHED State After RST Sent — 5.15 Kernel

Oracle oraclelinux at foxmail.com
Sun Jun 1 02:48:32 EDT 2025


I’m encountering an issue during a high-concurrency short-lived connection stress test in a distributed database system. The system under test is running on a single machine. The server sometimes sends a TCP RST after completing the three-way handshake, and while the client receives the RST and closes the connection, the server-side socket remains in ESTABLISHED state.

System Information

Linux systest104 5.15.0-305.176.4.el9uek.x86_64 #2 SMP Tue Jan 28 20:15:04 PST 2025 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

During the short-lived connection stress test (many TCP connections rapidly established and closed), the server (PostgreSQL) sends an unexpected RST after the handshake. Although the RST is sent and the client closes its side, the server socket stays in ESTABLISHED. This recurs under high load.
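
For anyone who wants to reproduce the traffic pattern, the loop below is a minimal sketch of a short-lived connection test. It assumes a throwaway loopback listener rather than PostgreSQL, the names start_listener and short_lived_connections are illustrative, and the loop is sequential, so it is far gentler than the real high-concurrency test.

```python
import socket
import threading


def start_listener(host="127.0.0.1"):
    """Start a throwaway listener that accepts and immediately closes connections."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, 0))      # port 0: let the kernel pick a free port
    srv.listen(128)

    def accept_loop():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:  # listener was closed
                return
            conn.close()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv


def short_lived_connections(host, port, count):
    """Open and close `count` connections back to back; return (ok, failed)."""
    ok = failed = 0
    for _ in range(count):
        try:
            with socket.create_connection((host, port), timeout=2):
                ok += 1
        except OSError:
            failed += 1
    return ok, failed


if __name__ == "__main__":
    listener = start_listener()
    host, port = listener.getsockname()
    print(short_lived_connections(host, port, 100))
    listener.close()
```

Running several copies of the client loop in parallel threads or processes would get closer to the concurrency of the actual test.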

netstat Output
[root@systest104 tools]# netstat -atpn | grep 45129 | grep LI
tcp 0 0 10.13.8.104:45129 0.0.0.0:* LISTEN 3961360/postgres:
tcp 0 0 10.13.8.104:45129 10.13.8.104:45052 ESTABLISHED 3961360/postgres:
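
The same check can be scripted against /proc/net/tcp, which is convenient for polling during a test run. This is a sketch that assumes IPv4 and a little-endian host (true for the x86_64 machine above). The names decode_endpoint and sockets_on_port are illustrative, and only two of the kernel's state codes are mapped.

```python
import socket
import struct

# Subset of the kernel's TCP state codes as printed in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}


def decode_endpoint(hex_endpoint):
    """Decode 'ADDR:PORT' as printed in /proc/net/tcp.

    The IPv4 address is printed as a native 32-bit integer, so on a
    little-endian host the bytes appear reversed; the port is plain hex.
    """
    addr_hex, port_hex = hex_endpoint.split(":")
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)


def sockets_on_port(proc_net_tcp_text, local_port):
    """Return (local, remote, state) tuples for sockets bound to local_port."""
    rows = []
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        local = decode_endpoint(fields[1])
        remote = decode_endpoint(fields[2])
        if local[1] == local_port:
            state = TCP_STATES.get(fields[3], fields[3])
            rows.append((local, remote, state))
    return rows


if __name__ == "__main__":
    try:
        with open("/proc/net/tcp") as f:
            for row in sockets_on_port(f.read(), 45129):
                print(row)
    except FileNotFoundError:
        print("/proc/net/tcp is not available on this system")
```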

TCP Traffic Capture (tcpdump)

From server (port 45129) to client (port 45052):

02:58:03.972859 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
10.13.8.104.45129 > 10.13.8.104.45052: Flags [S.], cksum 0x2518 (incorrect -> 0xe388), seq 175553476, ack 2894832976, win 65535, options [mss 65495,sackOK,TS val 1670382316 ecr 1670382316,nop,wscale 11], length 0
02:58:04.218997 IP (tos 0x0, ttl 64, id 9377, offset 0, flags [DF], proto TCP (6), length 52)
10.13.8.104.45129 > 10.13.8.104.45052: Flags [.], cksum 0x2510 (incorrect -> 0x0a6e), seq 1, ack 42, win 32, options [nop,nop,TS val 1670382564 ecr 1670382522], length 0
02:58:04.979788 IP (tos 0x0, ttl 64, id 9378, offset 0, flags [DF], proto TCP (6), length 52)
10.13.8.104.45129 > 10.13.8.104.45052: Flags [F.], cksum 0x2510 (incorrect -> 0x0776), seq 1, ack 42, win 32, options [nop,nop,TS val 1670383323 ecr 1670382522], length 0
02:58:06.494295 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
10.13.8.104.45129 > 10.13.8.104.45052: Flags [S.], cksum 0x2518 (incorrect -> 0x7ac8), seq 214961739, ack 2894898555, win 65535, options [mss 65495,sackOK,TS val 1670384838 ecr 1670384838,nop,wscale 11], length 0
02:58:06.497830 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)
10.13.8.104.45129 > 10.13.8.104.45052: Flags [R], cksum 0x0f95 (correct), seq 214961740, win 0, length 0

From client (port 45052) to server (port 45129):

10.13.8.104.45052 > 10.13.8.104.45129: Flags [S], cksum 0x2518 (incorrect -> 0x9bdf), seq 2894898554, win 65535, options [mss 65495,sackOK,TS val 1670384838 ecr 1670383323,nop,wscale 11], length 0
02:58:06.496318 IP (tos 0x0, ttl 64, id 15240, offset 0, flags [DF], proto TCP (6), length 52)
10.13.8.104.45052 > 10.13.8.104.45129: Flags [.], cksum 0x2510 (incorrect -> 0xa39b), seq 65579, ack 39408264, win 32, options [nop,nop,TS val 1670384839 ecr 1670384838], length 0
02:58:06.496321 IP (tos 0x0, ttl 64, id 15241, offset 0, flags [DF], proto TCP (6), length 84)
10.13.8.104.45052 > 10.13.8.104.45129: Flags [P.], cksum 0x2530 (incorrect -> 0x2b50), seq 65579:65611, ack 39408264, win 32, options [nop,nop,TS val 1670384839 ecr 1670384838], length 32
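
A quick sanity check on the sequence numbers in the capture, sketched in Python. The values are copied from the lines above. One assumption is labeled in the code: the client's "ack 39408264" looks like a relative number that tcpdump computed against the earlier SYN-ACK on the same port pair (tcpdump prints relative sequence numbers unless -S is given). Under that assumption, the RST's sequence number equals the acknowledgment number in the client's ACK, which is how a reset sent in reply to an ACK-bearing segment is numbered per the TCP specification.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers wrap at 32 bits


def next_seq(seq, consumed=1):
    """Sequence number after `consumed` units; a SYN or FIN consumes one."""
    return (seq + consumed) % SEQ_MOD


# Values copied from the capture.
client_syn_seq = 2894898554      # client SYN (second connection)
server_synack_seq = 214961739    # server SYN-ACK at 02:58:06.494295
server_synack_ack = 2894898555
server_rst_seq = 214961740       # server RST at 02:58:06.497830

first_synack_seq = 175553476     # server SYN-ACK at 02:58:03.972859
client_ack_relative = 39408264   # "ack" field on the client's ACK and PSH lines

# The second SYN-ACK acknowledges the client's new SYN.
synack_matches_syn = server_synack_ack == next_seq(client_syn_seq)

# The RST is numbered one past the second SYN-ACK.
rst_follows_synack = server_rst_seq == next_seq(server_synack_seq)

# ASSUMPTION: the relative ack is based on the first SYN-ACK's sequence number.
client_ack_absolute = (first_synack_seq + client_ack_relative) % SEQ_MOD

print(synack_matches_syn, rst_follows_synack, client_ack_absolute == server_rst_seq)
```

If the assumption holds, the reset was generated in response to the client's ACK for the second handshake rather than by a close of the already established socket. Capturing again with tcpdump -S would show absolute numbers and remove the guesswork.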

Kernel Socket State Transitions
ffff9de27cb78000 3499722 postgres   10.13.8.104 45052 → 45129   SYN_RECV → ESTABLISHED 0.003s
ffff9dd18eb25580 3961360 postgres   10.13.8.104 45129 → 45052   SYN_SENT → ESTABLISHED 1.165s
ffff9dd18eb25580 353      ksoftirqd 10.13.8.104 45129 → 45052   ESTABLISHED → CLOSE    4.154s
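
To see which socket ends in which state, the trace can be reduced to the last recorded state per socket. This is a rough sketch: the trace does not label its columns, so the code assumes they are socket address, PID, command, IP address, two ports, old state, new state, and elapsed time. The function name last_states is illustrative.

```python
TRACE = """\
ffff9de27cb78000 3499722 postgres   10.13.8.104 45052 → 45129   SYN_RECV → ESTABLISHED 0.003s
ffff9dd18eb25580 3961360 postgres   10.13.8.104 45129 → 45052   SYN_SENT → ESTABLISHED 1.165s
ffff9dd18eb25580 353      ksoftirqd 10.13.8.104 45129 → 45052   ESTABLISHED → CLOSE    4.154s
"""


def last_states(trace_text):
    """Map each socket address to its port pair and most recent new state."""
    last = {}
    for line in trace_text.splitlines():
        parts = line.split()
        if len(parts) != 11:  # skip anything that is not a transition line
            continue
        # ASSUMED column order; the arrows are separate whitespace-delimited tokens.
        sk, _pid, _comm, _addr, port_a, _, port_b, _old, _, new, _elapsed = parts
        last[sk] = {"ports": (int(port_a), int(port_b)), "state": new}
    return last


if __name__ == "__main__":
    for sk, info in last_states(TRACE).items():
        print(sk, info["ports"], info["state"])
```

On the three lines above, this reports the socket recorded as 45052 → 45129 as last seen in ESTABLISHED, and the socket recorded as 45129 → 45052 as last seen in CLOSE.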

Summary

1. The client (45052) initiates a connection to the server (45129).
2. The connection completes (ESTABLISHED).
3. A new SYN then appears, followed shortly by an RST from the server.
4. The server socket does not leave ESTABLISHED even though the server sent the RST.
5. From the server's perspective, the connection appears "stuck" in the ESTABLISHED state.

Question

What could cause the kernel to send an RST but still leave the server-side socket in ESTABLISHED state? Shouldn’t the kernel remove the connection after sending the RST?

Any insight into what may be going wrong here (socket leak, improper closure, application issue, kernel bug) would be appreciated.
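
For reference when ruling the application side in or out: one well-known way a userspace process makes the kernel emit an RST is an abortive close, that is, close() on a socket with SO_LINGER enabled and a zero timeout. The sketch below demonstrates that on loopback. Whether PostgreSQL does anything like this in the failing case is not established; the function name abortive_close_demo is illustrative.

```python
import socket
import struct


def abortive_close_demo():
    """Close an accepted socket abortively and report what the peer observes."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname(), timeout=2)
    conn, _ = srv.accept()

    # struct linger {l_onoff=1, l_linger=0}: close() sends RST instead of FIN.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()

    try:
        cli.recv(1)
        result = "clean EOF"
    except ConnectionResetError:
        result = "reset"
    finally:
        cli.close()
        srv.close()
    return result


if __name__ == "__main__":
    print(abortive_close_demo())
```

With an abortive close the closing side's socket is discarded immediately, so this path alone would not leave a socket in ESTABLISHED.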

Thanks,
ZHAO