Kernel module for a network interface - remove trailer from sk_buff on reception
abhi.raa.man.v at gmail.com
Wed Jun 28 06:45:36 EDT 2023
I am implementing the Parallel Redundancy Protocol (PRP, IEC standard
62439-3) as a kernel module for a school project of mine. The code for the
module can be found at: https://github.com/ramv33/prp. I have used the
kernel's HSR module, which also implements PRP, as a reference for my
implementation since I have zero prior experience with kernel programming.
Even though much of the design is different, I have relied on that module
for what to do and how to do it.
Let me provide a brief description of what PRP is and what it does. PRP is
used to provide hitless redundancy (zero recovery time). It does this by
having two parallel and independent Ethernet networks. A device is
connected to both these networks (LAN_A and LAN_B).
Whenever a frame is sent by the device, it duplicates the frame and appends
a trailer (Redundancy Control Trailer - RCT) to each copy. The RCT contains
a sequence number, a LAN_ID (LAN_A=0xA or LAN_B=0xB), an LSDU_size (Link
Service Data Unit, i.e., Ethernet payload size minus the RCT), and a
PRP_Suffix (0x88FB), for a total of 6 octets.
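To make the trailer layout concrete, here is a minimal userspace sketch of packing an RCT. The field widths (16-bit sequence number, 4-bit LAN_ID, 12-bit LSDU_size, 16-bit PRP_Suffix) are my understanding of the standard rather than something stated above, and the names prp_rct_pack, PRP_LAN_A and PRP_LAN_B are hypothetical, not taken from the module.

```c
#include <stdint.h>

#define PRP_RCTLEN 6
#define PRP_SUFFIX 0x88FB
#define PRP_LAN_A  0xA
#define PRP_LAN_B  0xB

/*
 * Write a 6-octet RCT at dst, most significant byte first.
 * Assumed layout: 16-bit sequence number, 4-bit LAN_ID,
 * 12-bit LSDU_size, 16-bit PRP_Suffix.
 * What lsdu_size counts is left to the caller.
 */
static void prp_rct_pack(uint8_t dst[PRP_RCTLEN], uint16_t seq,
                         uint8_t lan_id, uint16_t lsdu_size)
{
    /* LAN_ID occupies the top 4 bits, LSDU_size the low 12 bits. */
    uint16_t lan_lsdu = (uint16_t)(((lan_id & 0x0Fu) << 12) |
                                   (lsdu_size & 0x0FFFu));

    dst[0] = (uint8_t)(seq >> 8);
    dst[1] = (uint8_t)(seq & 0xFF);
    dst[2] = (uint8_t)(lan_lsdu >> 8);
    dst[3] = (uint8_t)(lan_lsdu & 0xFF);
    dst[4] = (uint8_t)(PRP_SUFFIX >> 8);
    dst[5] = (uint8_t)(PRP_SUFFIX & 0xFF);
}
```

Packing the same sequence number and size once with PRP_LAN_A and once with PRP_LAN_B gives two trailers that differ only in the top nibble of the third octet.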
On reception, the sequence number is used to detect and discard
duplicates; the RCT is then removed and the frame is forwarded to the upper
layers.
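As an illustration of the duplicate discard only, here is a minimal userspace sketch that remembers the last few sequence numbers seen from a single sender. This is a simplification of my own, not the algorithm from the standard or from the module; struct dup_filter, dup_filter_check and DUP_WINDOW are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define DUP_WINDOW 32  /* how many recent sequence numbers to remember */

struct dup_filter {
    uint16_t seen[DUP_WINDOW];
    unsigned int count;  /* number of valid entries in seen[] */
    unsigned int next;   /* slot that the next new entry overwrites */
};

/*
 * Return true if seq was already seen (the frame is a duplicate);
 * otherwise record it and return false. One filter per sender.
 */
static bool dup_filter_check(struct dup_filter *f, uint16_t seq)
{
    unsigned int i;

    for (i = 0; i < f->count; i++)
        if (f->seen[i] == seq)
            return true;

    f->seen[f->next] = seq;
    f->next = (f->next + 1) % DUP_WINDOW;
    if (f->count < DUP_WINDOW)
        f->count++;
    return false;
}
```

A zero-initialised filter treats the first copy of a frame as new and the second copy, arriving over the other LAN with the same sequence number, as a duplicate.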
The duplication on transmission and the discarding and removal of the RCT
on reception are done by the kernel module, which acts as the Link
Redundancy Entity (LRE) as specified in the standard. The LRE is
implemented as a virtual network interface. Transmission is handled by
defining the ndo_start_xmit function in the netdev_ops structure of my
net_device.
The *problem* I have is on reception, specifically with the *removal of the
RCT*. The receive handler is registered on device creation using
netdev_rx_handler_register and netdev_upper_dev_link.
When the receive handler detects that the RCT is well-formed (correct
PRP_Suffix, LSDU_size, and LAN_ID for the NIC through which it was
received), it tries to strip the RCT before calling netif_rx() on the skb
to forward it to the upper layers for processing.
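The well-formedness check can be sketched in userspace on a plain byte buffer as follows. This is not the module's code: prp_rct_parse, prp_rct_valid and struct prp_rct_info are hypothetical names, the 4-bit/12-bit split of LAN_ID and LSDU_size is my understanding of the standard, and the expected LSDU_size is passed in by the caller so that the sketch does not assume what the size counts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PRP_RCTLEN 6
#define PRP_SUFFIX 0x88FB

struct prp_rct_info {
    uint16_t seq;
    uint8_t  lan_id;
    uint16_t lsdu_size;
};

/*
 * Read the last 6 octets of frame as an RCT.
 * Returns false if the frame is too short or the suffix is wrong.
 */
static bool prp_rct_parse(const uint8_t *frame, size_t len,
                          struct prp_rct_info *out)
{
    const uint8_t *t;

    if (len < PRP_RCTLEN)
        return false;

    t = frame + len - PRP_RCTLEN;
    if (((t[4] << 8) | t[5]) != PRP_SUFFIX)
        return false;

    out->seq = (uint16_t)((t[0] << 8) | t[1]);
    out->lan_id = (uint8_t)(t[2] >> 4);
    out->lsdu_size = (uint16_t)(((t[2] & 0x0F) << 8) | t[3]);
    return true;
}

/* Compare the parsed fields with what the receiving port expects. */
static bool prp_rct_valid(const struct prp_rct_info *rct,
                          uint8_t port_lan_id, uint16_t expected_lsdu)
{
    return rct->lan_id == port_lan_id && rct->lsdu_size == expected_lsdu;
}
```

If both checks pass, the length after stripping is simply len - PRP_RCTLEN, which is what the trim in the receive path is meant to achieve.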
To strip the RCT, I call *skb_trim* as follows:
    skb_trim(skb, skb->len - PRP_RCTLEN /* 6 */);
as given in the HSR module.
I have used skb_dump both before and after the call to skb_trim and
verified that the length is being reduced and that the tailroom is
increased by 6 bytes. The problem is that when I call skb_trim, the packet
is not received by the upper layers. Without calling skb_trim, the packet
is received correctly, but the RCT is passed up to the applications along
with the payload, which should not be the case.
I used Wireshark to inspect the frames at both the sender and the receiver
side on the two physical devices and observed the following:
1. At the receiver side - The IP payload length differs for the same frame
received through LAN_A and LAN_B. The one received through LAN_B has a
payload length 6 greater than that for LAN_A (*6 is the size of the RCT*).
Looking at the ip_rcv_core function, an invalid IP payload length is one
reason a packet can be dropped.
2. At the sender side - For a frame pair, the two packets are identical
except for the RCT's LAN_ID field. The IP payload length is correct when I
capture outgoing frames on the two physical devices.
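To reason about the length check from item 1, here is a minimal userspace model of how the IPv4 total-length field relates to the number of bytes actually present. It is only a sketch: ip_len_check and the verdict names are hypothetical, and the behaviour it models (a buffer shorter than the total length is rejected, surplus trailing bytes are tolerated) is my reading of ip_rcv_core, which should be verified against the kernel version in use.

```c
#include <stddef.h>
#include <stdint.h>

enum ip_len_verdict {
    IP_LEN_OK,          /* buffer length equals the IPv4 total length */
    IP_LEN_TRAILING,    /* buffer is longer than the total length */
    IP_LEN_TRUNCATED,   /* buffer is shorter than the total length */
    IP_LEN_BAD_HEADER   /* header itself is malformed */
};

/*
 * ip points at the first octet of the IPv4 header, avail is the
 * number of bytes from there to the end of the buffer.
 */
static enum ip_len_verdict ip_len_check(const uint8_t *ip, size_t avail)
{
    size_t ihl, tot;

    if (avail < 20)
        return IP_LEN_BAD_HEADER;

    ihl = (size_t)(ip[0] & 0x0F) * 4;       /* header length in bytes */
    tot = ((size_t)ip[2] << 8) | ip[3];     /* total length, big-endian */

    if ((ip[0] >> 4) != 4 || ihl < 20)
        return IP_LEN_BAD_HEADER;
    if (avail < tot)
        return IP_LEN_TRUNCATED;
    if (tot < ihl)
        return IP_LEN_BAD_HEADER;
    if (avail > tot)
        return IP_LEN_TRAILING;
    return IP_LEN_OK;
}
```

Under that assumption, a frame whose total-length field is 6 too large would look fine while the RCT is still attached and would become truncated once 6 bytes are trimmed from the end.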
As mentioned at the top, the code is available at:
https://github.com/ramv33/prp. Here is a brief overview of the code:
1. Transmission is implemented in *prp_tx.c* in *prp_send_skb*, which is
called by the *prp_dev_xmit* function defined in *prp_dev.c*.
2. The code that sets up the two slave devices to forward received frames
to the virtual interface is in *prp_dev.c*, in the function
*prp_port_setup*. It is called by *prp_add_ports*, which is called by
*prp_dev_finalize*, which in turn is called by the RTNL newlink callback.
3. The receive handler is *prp_recv_frame*, defined in *prp_rx.c*. It
checks whether the frame has a valid RCT, discards duplicates, and then
strips the RCT by calling *strip_rct* before calling *prp_net_if*, which
removes the Ethernet header and calls *netif_rx*.
I hope you can help me find a solution to the removal of the RCT. Please
point out any mistakes that I am making. Thank you in advance :)