DPDK: forward received packets to the default network stack

We're using DPDK (version 20.08 on Ubuntu 20.04, C++ application) to receive UDP packets at a high rate (>2 Mpps). We use a Mellanox ConnectX-5 NIC (and a Mellanox ConnectX-3 in an older system; it would be great if the solution worked there as well).

In contrast, since we only need to send a few configuration messages, we send them through the default network stack. This way, we can use lots of readily available tools for the configuration messages; however, since all received data is consumed by DPDK, these tools never get any messages back.

The most prominent issue arises with ARP negotiation: the host tries to resolve addresses, and the clients do respond properly, but these responses are all consumed by DPDK, so the host cannot resolve the addresses and refuses to send the actual UDP packets.

Our idea would be to filter out the high-throughput packets in our application and somehow "forward" everything else (e.g. ARP responses) to the default network stack. Does DPDK have a built-in solution for that? Unfortunately, I couldn't find anything in the examples.

I've recently heard about the packet function, which allows injecting packets into SOCK_DGRAM sockets; this may be a possible solution. However, I couldn't find a sample implementation for our use case either. Any help is greatly appreciated.
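To illustrate, the receive path we have in mind looks roughly like the sketch below; handle_high_rate_udp() and forward_to_kernel() are placeholders, and the latter is exactly the mechanism we are missing.

  /* Rough sketch of the intended receive path; the two helpers are
   * placeholders, forward_to_kernel() being the part we are looking for. */
  #include <rte_ethdev.h>
  #include <rte_mbuf.h>
  #include <rte_ether.h>
  #include <rte_byteorder.h>

  #define BURST_SIZE 32

  static void handle_high_rate_udp(struct rte_mbuf *m);  /* our fast path */
  static void forward_to_kernel(struct rte_mbuf *m);     /* the missing piece */

  static void rx_loop(uint16_t port_id, uint16_t queue_id)
  {
      struct rte_mbuf *bufs[BURST_SIZE];

      for (;;) {
          uint16_t nb_rx = rte_eth_rx_burst(port_id, queue_id, bufs, BURST_SIZE);

          for (uint16_t i = 0; i < nb_rx; i++) {
              const struct rte_ether_hdr *eth =
                  rte_pktmbuf_mtod(bufs[i], const struct rte_ether_hdr *);

              /* Simplified classification: keep IPv4 (our UDP stream),
               * hand everything else (ARP, ...) to the kernel stack. */
              if (eth->ether_type == rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4))
                  handle_high_rate_udp(bufs[i]);
              else
                  forward_to_kernel(bufs[i]);
          }
      }
  }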



Solution 1:[1]

Theoretically, if the NIC in question supports the embedded switch feature, it should be possible to intercept the packets of interest in the hardware and redirect them to a virtual function (VF) associated with the physical function (PF), with the PF itself receiving everything else.

  • The user configures SR-IOV feature on the NIC / host as well as virtualisation support;
  • For a given NIC PF, the user adds a VF and binds it to the corresponding Linux driver;
  • The DPDK application is run with the PF ethdev and a representor ethdev for the VF;
  • To handle the packets in question, the application adds the corresponding flow rules.

The PF (ethdev 0) and the VF representor (ethdev 1) have to be explicitly specified by the corresponding EAL argument in the application: -a [pci:dbdf],representor=vf0.
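For instance, with a recent DPDK release and a hypothetical PF at PCI address 0000:3b:00.0 (substitute the actual device), testpmd can be launched with this argument as follows:

  dpdk-testpmd -a 0000:3b:00.0,representor=vf0 -- -i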

As for the flow rules, a pair of them is needed.

The first rule's components are as follows:

  • Attribute transfer (demands that matching packets be handled in the embedded switch);
  • Pattern item REPRESENTED_PORT with port_id = 0 (instructs the NIC to intercept packets coming to the embedded switch from the network port represented by the PF ethdev);
  • Pattern items matching on network headers (these provide narrower match criteria);
  • Action REPRESENTED_PORT with port_id = 1 (redirects packets to the VF).

In the second rule, item REPRESENTED_PORT has port_id = 1, and action REPRESENTED_PORT has port_id = 0 (that is, this rule is inverse). Everything else should remain the same.

It is important to note that some drivers do not support item REPRESENTED_PORT at the moment. Instead, they expect the rules to be added via the corresponding ethdevs. In that case, for the provided example, the first rule goes to ethdev 0 and the second one to ethdev 1.


As per the OP update, the adapter in question might indeed support the embedded switch feature. However, as noted above, item REPRESENTED_PORT might not be supported. The rules should be inserted via specific ethdevs. Also, one more attribute, ingress, might need to be specified.

In order to check whether this scheme works, one should be able to deploy a VF (as described above) and run testpmd with the aforementioned EAL argument. In the command line of the application, the two flow rules can be tested as follows:

  • flow create 0 ingress transfer pattern eth type is 0x0806 / end actions represented_port ethdev_port_id 1 / end
  • flow create 1 ingress transfer pattern eth type is 0x0806 / end actions represented_port ethdev_port_id 0 / end

Once done, this should pass ARP packets to the VF (and thus to its network interface). The rest of the packets should be seen by testpmd in active forwarding mode (start command).
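For reference, the same pair of rules could be installed from the application via the rte_flow API. The following is only a sketch (not verified against the adapters in question), assuming a DPDK release new enough (21.11 or later) to provide item/action REPRESENTED_PORT:

  #include <rte_flow.h>
  #include <rte_ether.h>
  #include <rte_byteorder.h>

  /* Redirect ARP frames arriving at `from_port` to `to_port` inside the
   * embedded switch. `rule_port` is the ethdev through which the rule is
   * inserted (relevant for drivers that lack item REPRESENTED_PORT and
   * expect the rule on the corresponding ethdev instead). */
  static struct rte_flow *
  arp_redirect(uint16_t rule_port, uint16_t from_port, uint16_t to_port,
               struct rte_flow_error *error)
  {
      /* transfer: handle matching packets in the embedded switch */
      struct rte_flow_attr attr = { .ingress = 1, .transfer = 1 };

      struct rte_flow_item_ethdev from = { .port_id = from_port };
      struct rte_flow_item_eth eth_spec = {
          .type = RTE_BE16(RTE_ETHER_TYPE_ARP)    /* 0x0806 */
      };
      struct rte_flow_item_eth eth_mask = { .type = RTE_BE16(0xffff) };

      struct rte_flow_item pattern[] = {
          { .type = RTE_FLOW_ITEM_TYPE_REPRESENTED_PORT, .spec = &from },
          { .type = RTE_FLOW_ITEM_TYPE_ETH,
            .spec = &eth_spec, .mask = &eth_mask },
          { .type = RTE_FLOW_ITEM_TYPE_END },
      };

      struct rte_flow_action_ethdev to = { .port_id = to_port };
      struct rte_flow_action actions[] = {
          { .type = RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, .conf = &to },
          { .type = RTE_FLOW_ACTION_TYPE_END },
      };

      return rte_flow_create(rule_port, &attr, pattern, actions, error);
  }

  /* Usage with PF = ethdev 0 and VF representor = ethdev 1:
   *   arp_redirect(0, 0, 1, &err);    first rule
   *   arp_redirect(0, 1, 0, &err);    inverse rule
   * For drivers lacking item REPRESENTED_PORT, insert the first rule via
   * ethdev 0 and the second via ethdev 1 instead. */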

NOTE: it is recommended to switch to the most recent DPDK release.

Solution 2:[2]

For the current use case, the best option is to make use of the DPDK TAP PMD (which is part of DPDK on Linux). You can use software or hardware filtering to pick out the specific packets and then send them to the desired TAP interface.

A simple way to demonstrate this is to use the DPDK skeleton example.

  1. Build the DPDK example via cd [root folder]/examples/skeleton; make static
  2. Pass the desired physical DPDK NIC using the DPDK EAL options: ./build/basicfwd -l 1 -w [pcie id of DPDK NIC] --vdev=net_tap0,iface=dpdkTap
  3. In a second terminal, execute ifconfig dpdkTap 0.0.0.0 promisc up
  4. Use tcpdump to capture ingress and egress packets with tcpdump -eni dpdkTap -Q in and tcpdump -eni dpdkTap -Q out, respectively.

Note: you can configure an IP address and set up TC on dpdkTap, and you can also run your custom socket programs on it. You do not need to invest time in TLDK, ANS, or VPP; as per your requirement, you just need a mechanism to inject packets into, and receive packets from, the kernel network stack.
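As a rough illustration of the software-filtering variant, the sketch below keeps the high-rate IPv4/UDP traffic in the application and pushes everything else (ARP and the like) to the TAP port, essentially filling in the forward_to_kernel() step from the question. The port ids (0 for the physical NIC, 1 for net_tap0) and process_udp() are assumptions:

  #include <rte_ethdev.h>
  #include <rte_mbuf.h>
  #include <rte_ether.h>
  #include <rte_byteorder.h>

  #define BURST_SIZE 32
  #define NIC_PORT 0   /* assumed: physical NIC */
  #define TAP_PORT 1   /* assumed: net_tap0 (dpdkTap) */

  static void process_udp(struct rte_mbuf *m);   /* application's fast path */

  static void split_loop(void)
  {
      struct rte_mbuf *bufs[BURST_SIZE];
      struct rte_mbuf *to_tap[BURST_SIZE];

      for (;;) {
          uint16_t nb_rx = rte_eth_rx_burst(NIC_PORT, 0, bufs, BURST_SIZE);
          uint16_t nb_tap = 0;

          for (uint16_t i = 0; i < nb_rx; i++) {
              const struct rte_ether_hdr *eth =
                  rte_pktmbuf_mtod(bufs[i], const struct rte_ether_hdr *);

              if (eth->ether_type == rte_cpu_to_be_16(RTE_ETHER_TYPE_IPV4))
                  process_udp(bufs[i]);          /* keep in the application */
              else
                  to_tap[nb_tap++] = bufs[i];    /* ARP etc. -> kernel via TAP */
          }

          /* Whatever the TAP PMD accepts shows up on the dpdkTap interface
           * and is handled by the kernel network stack. */
          uint16_t nb_tx = rte_eth_tx_burst(TAP_PORT, 0, to_tap, nb_tap);
          for (uint16_t i = nb_tx; i < nb_tap; i++)
              rte_pktmbuf_free(to_tap[i]);       /* drop what could not be sent */
      }
  }

The reverse direction (reading the kernel's own ARP traffic from the TAP port and transmitting it on the NIC) works the same way with the two ports swapped.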

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1
[2] Solution 2: Vipin Varghese