PCIe DMA transfer
Greg KH
greg at kroah.com
Mon Jun 4 08:05:05 EDT 2018
On Mon, Jun 04, 2018 at 01:12:48PM +0200, Christoph Böhmwalder wrote:
> Hi,
>
> I'm not sure how on-topic this is on this list, but I have a question
> regarding a device driver design issue.
>
> For our Bachelor's project, my team and I are tasked with optimizing an
> existing hardware solution. The design utilizes an FPGA to accomplish
> various tasks, including a Triple Speed Ethernet controller that is linked to
> the CPU via PCI Express. Currently, the implementation is fairly naive,
> and the driver just does byte-by-byte reads directly from a FIFO on the
> FPGA device. This, of course, is quite resource intensive and essentially
> hogs the CPU completely (throughput peaks at around 10 Mbit/s).
>
> Our plan to solve this problem is as follows:
>
> * Keep a buffer on the FPGA that retains a number of Ethernet packets.
> * Once a certain threshold is reached (or a period of time, e.g. 5ms, elapses),
> the buffer is flushed and sent directly to RAM via DMA.
> * When the buffer is flushed and the data is in RAM and accessible by
> the CPU, the device raises an interrupt, signalling the CPU to read
> the data.
The problem in this design might happen right here. What happens
in the device between the interrupt being signaled, and the data being
copied out of the buffer? Where do new packets go? How does the
device know it is "safe" to write new data to that memory? That extra
housekeeping in the hardware gets very complex very quickly.
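To make that concrete, here is a very rough sketch of one common answer to
the "is it safe to write?" question: a descriptor ring with an ownership
flag, where the device only fills slots the driver has explicitly handed
back. The structure layout and all names below are made up for
illustration, they are not taken from any real driver or spec:

#include <linux/types.h>
#include <linux/bits.h>
#include <asm/barrier.h>

/*
 * One descriptor per receive buffer.  The device clears DESC_HW_OWNED
 * when it has finished filling a slot; the driver sets it again once it
 * is done with the data, handing the slot back to the hardware.
 */
#define DESC_HW_OWNED	BIT(0)

struct rx_desc {
	u64 dma_addr;	/* bus address of the packet buffer */
	u32 len;	/* filled in by the device on completion */
	u32 flags;	/* ownership and status bits */
};

/* Driver side: hand slot i back to the hardware after consuming it. */
static void rx_refill_slot(struct rx_desc *ring, unsigned int i)
{
	ring[i].len = 0;
	/* be sure our reads of the old data are done before the handover */
	dma_wmb();
	ring[i].flags |= DESC_HW_OWNED;
}

All of the "can the device write here?" bookkeeping then lives in that one
bit per slot, and the device never overwrites memory the CPU might still
be reading.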
> * In the interrupt handler, we `memcpy` the individual packets to
> another buffer and hand them to the upper layer in the network stack.
This all might work, if you have multiple buffers, as that is how some
drivers work. Look at how the XHCI design is specified. The spec is
open, and it gives you a very good description of how a relatively
high-speed PCIe device should work, with buffer management and the like.
You can probably use a lot of that type of design for your new work and
make things run a lot faster than what you currently have.
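To sketch what "multiple buffers" can look like on the driver side, here
is a rough RX completion loop built on the hypothetical rx_desc ring from
the sketch above. Again, the names (my_priv, rx_buf, RX_RING_SIZE, and so
on) are invented for this example and not taken from any in-tree driver:

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/etherdevice.h>

#define RX_RING_SIZE	256

struct my_priv {
	struct napi_struct napi;
	struct net_device *netdev;
	struct rx_desc *rx_ring;	/* ring shared with the device */
	void **rx_buf;			/* CPU addresses of the packet buffers */
	unsigned int rx_next;		/* next slot to look at */
};

/* Called from the NAPI poll callback, not from the hard interrupt. */
static int my_rx_poll(struct my_priv *priv, int budget)
{
	int done = 0;

	while (done < budget) {
		struct rx_desc *desc = &priv->rx_ring[priv->rx_next];
		struct sk_buff *skb;

		if (desc->flags & DESC_HW_OWNED)
			break;	/* the device has not filled this slot yet */

		/* read len/data only after seeing the ownership flip */
		dma_rmb();

		skb = napi_alloc_skb(&priv->napi, desc->len);
		if (!skb)
			break;

		/* copy out of the DMA buffer, then hand the slot straight back */
		skb_put_data(skb, priv->rx_buf[priv->rx_next], desc->len);
		skb->protocol = eth_type_trans(skb, priv->netdev);
		napi_gro_receive(&priv->napi, skb);

		rx_refill_slot(priv->rx_ring, priv->rx_next);
		priv->rx_next = (priv->rx_next + 1) % RX_RING_SIZE;
		done++;
	}

	return done;
}

The interrupt handler itself then only has to acknowledge the device and
schedule the poll, so the copying and handing packets up the stack happens
in softirq context, with the budget keeping it from monopolizing the CPU.
That is roughly the NAPI model the in-tree network drivers use.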
You also have access to loads of very high-speed drivers in Linux today
to draw design examples from. Look at the networking drivers for the
10, 40, and 100Gb cards, as well as the InfiniBand drivers, and even some
of the PCIe flash block drivers. For other examples, look at what the
NVMe spec says about how those types of high-speed storage devices should
be designed.
best of luck!
greg k-h