Linux: flow steering traffic from a network interface to a specific core?

My system is Ubuntu 21.04, kernel 5.11.0-49-generic. I have 16 cores (32 with HT). The network driver is Intel i40e. Note: I am using SR-IOV VFs for my interfaces. Throughout all of my testing I am generating ~14 Mpps toward a 10 Gbps interface, so I am flooding the interface to make sure ksoftirqd usage peaks.

I want to be able to force all traffic from a NIC onto a specific core.

I used to be able to accomplish this with ethtool's flow steering (ntuple) mechanism. For example, the following command steers traffic matching the criteria below to queue 2 (and thus, in my setup, core 2):

ethtool -N enp94s0f2 flow-type udp4 dst-ip 10.10.10.100 dst-port 1234 action 2

But this only works on the NIC's PFs. I cannot do this with SR-IOV VFs; the driver does not support it.

I have already read up on RSS, RPS, and RFS from here and here.

I can't find much information on how to configure RSS; it seems to be a hardware-based way of flow steering. I am not sure my NIC supports it, nor do I know how to do anything with it. I don't think I want RFS, since that seems to be a dynamic layer on top of RPS that steers each flow toward the core where the consuming application is running.
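For what it's worth, I believe RSS can at least be inspected (and possibly reconfigured) with ethtool -x / -X, assuming the VF driver exposes it; something along these lines is what I would try:

ethtool -x enp94s0f2v1            # show the RSS indirection table and hash key, if supported
ethtool -X enp94s0f2v1 equal 1    # point every hash bucket at queue 0 (roughly the same effect as combined 1)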

This leaves me with using RPS and/or adjusting the number of RX queues. Currently I have been able to force all traffic onto one core by running the following:

ethtool -L enp94s0f2v1 combined 1

Then making sure rps_cpus is left at its default of off (00000000):

$cat /sys/class/net/enp94s0f2v1/queues/rx-0/rps_cpus
00000000

In this example, after doing all this, I can see from top and cat /proc/softirqs that core 0 is getting all of the traffic.
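For reference, this is roughly how I watch the per-core NET_RX counters while generating traffic:

watch -n1 -d "grep -E 'CPU|NET_RX' /proc/softirqs"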

I already understand that, because of NUMA, the NIC is attached to a specific NUMA node, so I think I can only use a core listed for numa_node 0.

$cat /sys/class/net/enp94s0f2v1/device/numa_node
0
$cat /sys/devices/system/node/node0/cpulist
0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30

But how can I force traffic onto another core, for example "steer all traffic processing to core 10 only"? Also, is it possible to force the traffic onto a core in a different NUMA node?

I tried messing with rps_cpus, but I am still having trouble understanding how this file works. I think the value I put in here is a hex bitmask of all the cores I want to use. Is there a way to pinpoint rps_cpus to a single specific core? A sketch of what I would try for core 10 follows the tests below.
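If my reading of the bitmask is right, a single core should just be a mask with only that core's bit set; for core 10 (the example above) I assume this would be the value to write, with my VF's interface name substituted in:

printf '%x\n' $((1 << 10))        # mask with only bit 10 set -> 400
echo 400 > /sys/class/net/enp94s0f2v1/queues/rx-0/rps_cpus
cat /sys/class/net/enp94s0f2v1/queues/rx-0/rps_cpus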

First, I have seen f being used a lot in guides. Setting the value to binary 00001111, i.e. just f in hex, forces cores 0, 1, 2 & 3 to be used even when there is only one RX queue.

echo "f" > /sys/class/net/enp9s0f2v1/queues/rx-0/rps_cpus

However, testing further, I found the following:

With only one RX queue enabled, setting rps_cpus to 00000001 uses only core 0.

echo "00000001" > /sys/class/net/enp9s0f2v1/queues/rx-0/rps_cpus

What is odd is that the second I set the above value, another core starts showing heavy ksoftirqd usage as well. So in the example above, core 0 is still being used, but core 8 starts getting a lot of CPU usage too.

Different test: with only one RX queue enabled, setting rps_cpus to 00000010 (bit 4 in hex) forces core 4 to be used, plus core 8 again.

echo "00000010" > /sys/class/net/enp9s0f2v1/queues/rx-0/rps_cpus

So I get that I can move traffic to a specific core. It seems I can't use the HT siblings, and I am limited to the cores on the NUMA node the interface is attached to.

But why, when I adjust rps_cpus with only one RX queue, do I see another core being used heavily? Is it because rps_cpus requires some type of hashing to be performed before the traffic is steered?
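In case it is relevant, this is how I check which core the VF's hardware interrupt actually lands on (and, assuming that IRQ can be pinned like any other, how I would move it; the IRQ number 123 below is only a placeholder for whatever the grep reports on my box):

grep enp94s0f2v1 /proc/interrupts          # find the IRQ number(s) used by the VF's queue
cat /proc/irq/123/smp_affinity_list        # 123 = placeholder IRQ number from the grep above
echo 10 > /proc/irq/123/smp_affinity_list  # would pin that IRQ to core 10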

Is there some other way to steer traffic to specific cores I am not aware of?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
