Linux UDP ephemeral port bind occasionally fails to receive

I am writing a set of tests that depend on two applications communicating (locally) over UDP sockets. These UDP sockets are initially bound using port 0, then the actual bound port is queried using getsockname and shared between the two applications. Occasionally, the subsequent recv calls fail to return any data, even though no error is reported during socket setup or binding.

I've simplified my implementation to a bare-bones test and included it below. It fails with "Received failed, got -1 expected 6" in about 1 of 10 runs. What am I missing in the socket setup to reliably use an ephemeral, OS-assigned port?

```cpp
#include <arpa/inet.h>
#include <errno.h>
#include <netdb.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>
#include <cstdio>
#include <cstring>
#include <thread>

int32_t createSocket(uint16_t& rBoundPort)
{
  int32_t s = socket(AF_INET, SOCK_DGRAM, 0);

  if (s >= 0)
  {
    timeval timeout; 
    timeout.tv_sec = 1;
    timeout.tv_usec = 0;
    
    if (setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeval)) < 0)
    { 
      ::close(s);
      return -1;
    }

    int32_t r(1);
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &r, sizeof(int)) < 0)
    { 
      close(s);
      return -1;
    }

    int32_t bufSize(50000);
    if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &bufSize, sizeof(bufSize)) < 0)
    { 
      close(s);
      return -1;
    }

    // Setup local listening port
    sockaddr_in listenAddress;
    memset(reinterpret_cast<char*>(&listenAddress), 0, sizeof(listenAddress));
    listenAddress.sin_family = AF_INET;
    inet_aton("127.0.0.1", &listenAddress.sin_addr);
    listenAddress.sin_port = htons(0);
    
    if (bind(s, (struct sockaddr*)&listenAddress,  sizeof(listenAddress)) != 0)
    { 
      close(s);
      return -1;
    }

    // Update the bound listen port
    socklen_t boundAddrLen = sizeof(sockaddr_in);
    if (getsockname(s, (struct sockaddr*)&listenAddress, &boundAddrLen) != 0)
    {
      close(s);
      return -1;
    }
    rBoundPort = ntohs(listenAddress.sin_port);
  }

  return s;
}

void mysockettest(int32_t s, uint16_t destPort)
{
  const int32_t dataSize = 6;
  char aWrite[dataSize] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05 };
  char aRx[dataSize];

  sockaddr_in dest;
  memset((char*)&dest, 0, sizeof(dest));
  dest.sin_family = AF_INET;
  inet_aton("127.0.0.1", &dest.sin_addr);
  dest.sin_port = htons(destPort);

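  // Report a failed send instead of silently ignoring it
  if (sendto(s, aWrite, dataSize, 0, (struct sockaddr*)&dest, sizeof(sockaddr_in)) != dataSize)
  {
    printf("Send failed: %s\n", strerror(errno));
  }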

  sockaddr_in src;
  socklen_t srcSize = sizeof(sockaddr_in);
  memset((char*)&src, 0, srcSize);

  int32_t bytesReceived = recvfrom(s, aRx, dataSize, 0, (struct sockaddr*)&src, &srcSize);
  if (bytesReceived != dataSize)
  {
    printf("Received failed, got %d expected %d\n", bytesReceived, dataSize);
  }
}

int main(int argc, char** pargv)
{
  uint16_t s1Port(0);
  int32_t s1 = createSocket(s1Port);
  if (s1 < 0)
  {
    printf("FAILED TO OPEN SOCKET 1\n");
    return -1;
  }
  if (s1Port < 1)
  {
    printf("FAILED TO BIND SOCKET 1 TO PORT\n");
    return -1;
  }

  uint16_t s2Port(0);
  int32_t s2 = createSocket(s2Port);
  if (s2 < 0)
  {
    printf("FAILED TO OPEN SOCKET 2\n");
    return -1;
  }
  if (s2Port < 1)
  {
    printf("FAILED TO BIND SOCKET 2 TO PORT\n");
    return -1;
  }

  std::thread t1(mysockettest, s1, s2Port);
  std::thread t2(mysockettest, s2, s1Port);

  t1.join();
  t2.join();

  close(s1);
  close(s2);
}
```


Solution 1:[1]

Unfortunately, I can't explain the details of the "why" (perhaps someone else can), but I found that adding a global mutex lock around the send fixed the instability on the send side. A global lock isn't practical in my situation, so I instead implemented a simple retry loop on the send (if send() returns -1, retry up to 10 times). While this solution is not ideal, it has proven stable.
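For illustration, the retry loop could look roughly like the sketch below. The helper name `sendWithRetry` and the `maxAttempts` parameter are my own placeholders, not part of the original test, and logging `errno` on each failure is an addition that may help track down the underlying cause.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper (name and signature are assumptions): calls sendto()
// up to maxAttempts times and returns the byte count of the first successful
// send, or -1 if every attempt failed.
ssize_t sendWithRetry(int s, const void* pData, size_t size,
                      const sockaddr_in& dest, int maxAttempts = 10)
{
  for (int attempt = 1; attempt <= maxAttempts; ++attempt)
  {
    ssize_t sent = sendto(s, pData, size, 0,
                          reinterpret_cast<const sockaddr*>(&dest), sizeof(dest));
    if (sent >= 0)
    {
      return sent;
    }

    // Log errno so the reason for the failed attempt is visible
    fprintf(stderr, "sendto attempt %d failed: %s\n", attempt, strerror(errno));
  }

  return -1;
}
```

In the test above, this would replace the bare `sendto` call in `mysockettest`, e.g. `sendWithRetry(s, aWrite, dataSize, dest)`.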

Once the send side was stable, I also saw occasional failures on the read side. With SO_REUSEADDR set and binding to port 0 (OS assignment), both sockets could be assigned and bound to the same port, so I removed SO_REUSEADDR when binding to port 0.
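A minimal sketch of the bind without SO_REUSEADDR is shown below. The helper name `bindEphemeral` is hypothetical, and the receive timeout and buffer options from the original `createSocket` are left out for brevity.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>

// Hypothetical helper (name is an assumption): binds a UDP socket to
// 127.0.0.1 on an OS-assigned port WITHOUT setting SO_REUSEADDR.
// Returns the socket descriptor, or -1 on failure; the assigned port is
// written to rBoundPort on success.
int bindEphemeral(uint16_t& rBoundPort)
{
  int s = socket(AF_INET, SOCK_DGRAM, 0);
  if (s < 0)
  {
    return -1;
  }

  sockaddr_in addr;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = htons(0);  // port 0: let the OS pick an ephemeral port

  socklen_t len = sizeof(addr);
  if (bind(s, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
      getsockname(s, reinterpret_cast<sockaddr*>(&addr), &len) != 0)
  {
    close(s);
    return -1;
  }

  rBoundPort = ntohs(addr.sin_port);
  return s;
}
```

A test can additionally compare the two returned ports and fail early if they are equal, which makes a shared-port assignment obvious instead of showing up later as a receive timeout.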

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 T. Waters