memcpy Performance Varies Based on Whether the Destination Pointer Changes During the Loop

I am working with an FPGA card to capture data coming off the network, and I am having trouble copying the data to memory quickly enough. I use memcpy to copy the bytes from one location to another, but when the destination advances through the available space, I get an error saying that I am not copying data out of the source quickly enough and that it is being overwritten by incoming data.

However, if I set the destination to a static value (either an absolute number or a variable that does not change), I do not get the 'not copying fast enough' error. That configuration is useless, since every copy overwrites the previous data in the destination, but it shows that memcpy itself can run fast enough.

Code sample (elements from vendor abstracted):

unsigned char *destinationBuffer = new unsigned char[500000000];
unsigned int destBuffer_current_size = 0;

while (!RingBuffer->IsEmpty()) {
    // abstraction - get pointer to next unit to be copied from source buffer
    bytes = (char*)unitToBeCopied;

    // copy one 4096-byte unit, then advance the write offset
    memcpy((void*)&destinationBuffer[destBuffer_current_size], (void*)bytes, 4096);
    destBuffer_current_size += 4096;
}

Things that have allowed memcpy to work efficiently enough but are not viable options:

Set the pointer to destinationBuffer[0]

memcpy((void*)&destinationBuffer[0], (void*)bytes, 4096);

Set the pointer to destinationBuffer[490000000] - means that accessing portions of the array that are significant distances from the head is not the problem

memcpy((void*)&destinationBuffer[490000000], (void*)bytes, 4096);

Use an unsigned int variable as the destination index, but do not increment it - means that evaluating a variable is not the bottleneck

memcpy((void*)&destinationBuffer[destBuffer_current_size], (void*)bytes, 4096);
destBuffer_current_size += 0; //ie destBuffer_current_size does not increment

Use an unsigned int that does not increment as the destination index, but increment a second unsigned int inside the loop - means that the increment itself is not the bottleneck

memcpy((void*)&destinationBuffer[destBuffer_current_size], (void*)bytes, 4096);
second_destBuffer_current_size += 4096; // destBuffer_current_size stays fixed; a second unsigned int increments instead

Things that have not worked:

  • Changing the buffer size
  • Changing (void*)&var[] to just &var[]
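The fixed-versus-advancing comparison described above can be reproduced outside the vendor environment with a small timing harness. The following is only a sketch under stated assumptions: `kUnitSize`, `timeCopies`, `Result`, and `compare` are hypothetical names, the source is an ordinary in-memory block rather than the FPGA ring buffer, and an optimizing compiler may elide repeated copies to the same address, so the numbers should be treated as indicative only.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-in for one 4096-byte unit from the vendor ring buffer.
constexpr std::size_t kUnitSize = 4096;

// Copies `units` blocks of kUnitSize bytes from `src` into `dst`.
// If `advance` is true, the destination offset moves forward after each copy
// (the slow case in the question); otherwise every copy lands at offset 0.
// Returns the elapsed wall-clock time in seconds.
double timeCopies(unsigned char* dst, const unsigned char* src,
                  std::size_t units, bool advance) {
    std::size_t offset = 0;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < units; ++i) {
        std::memcpy(dst + offset, src, kUnitSize);
        if (advance) {
            offset += kUnitSize;
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

struct Result {
    double fixedSeconds;
    double advancingSeconds;
};

// Runs both variants against a destination allocated with new[], as in the
// question. The fixed run goes first, so it only ever writes the first unit
// before the advancing run walks through the rest of the buffer.
Result compare(std::size_t units) {
    std::unique_ptr<unsigned char[]> dst(new unsigned char[units * kUnitSize]);
    std::vector<unsigned char> src(kUnitSize, 0xAB);
    Result r;
    r.fixedSeconds = timeCopies(dst.get(), src.data(), units, false);
    r.advancingSeconds = timeCopies(dst.get(), src.data(), units, true);
    return r;
}
```

Calling `compare(100000)` from a `main` function and printing both fields gives a rough side-by-side of the two cases for a buffer of about 400 MB. Running the advancing case a second time on the same buffer is another data point worth collecting.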

Why would memcpy run faster when the destination pointer is not incremented, and is there a way to get the same speed while incrementing it?

Note: Yes, it would be faster if I could just swap pointers and be done with it. That is not an option here, because the source data is in a section of memory where it will get overwritten if not copied quickly enough.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
