What form should the output layer of a deep learning network take for multi-object bounding box regression?

I am building a neural network on top of MobileNet SSD v2, specifically for bounding box regression. I have had a hard time finding clear resources on how the model's output should be shaped. My data generally has 1-4 boxes per image, so I could simply concatenate the coordinates and use Dense(16) as the output, but what happens when more than 4 objects are present in an image? I am unsure how to handle a dynamic, multi-object output layer. How can I do this, and are there any detailed resources that can be shared?
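To make the fixed-size idea concrete, here is a minimal sketch of one common workaround: predict a fixed number of box "slots", each with an objectness score, and pad the unused slots. This is only an illustration under stated assumptions, not how MobileNet SSD v2 itself is implemented; SSD-style detectors use a more elaborate version of the same idea, with a fixed set of default (anchor) boxes followed by score thresholding and non-maximum suppression. The names `MAX_BOXES`, `encode_targets`, and `decode_predictions` are hypothetical, and the sketch is plain Python so it does not depend on any framework. In Keras, the matching head could be something like `Dense(MAX_BOXES * 5)` followed by `Reshape((MAX_BOXES, 5))`.

```python
# Hypothetical upper bound on objects per image (an assumption, tune to your data).
MAX_BOXES = 8


def encode_targets(boxes, max_boxes=MAX_BOXES):
    """Pad a variable-length list of (x_min, y_min, x_max, y_max) boxes
    into a fixed-size target of shape (max_boxes, 5).

    Each row is [objectness, x_min, y_min, x_max, y_max]; padded rows
    have objectness 0.0 and zeroed coordinates.
    """
    if len(boxes) > max_boxes:
        raise ValueError(f"got {len(boxes)} boxes, but max_boxes is {max_boxes}")
    target = [[1.0, *box] for box in boxes]
    # Build each padding row separately so rows do not share one list object.
    target += [[0.0] * 5 for _ in range(max_boxes - len(boxes))]
    return target


def decode_predictions(pred, threshold=0.5):
    """Keep only the slots whose objectness score reaches the threshold,
    so the number of returned boxes varies per image."""
    return [tuple(row[1:]) for row in pred if row[0] >= threshold]


# Usage: two ground-truth boxes in normalized coordinates.
boxes = [(0.1, 0.2, 0.4, 0.5), (0.3, 0.3, 0.9, 0.8)]
target = encode_targets(boxes)
recovered = decode_predictions(target)
```

With this layout the output shape never changes, and the number of detections is decided at decode time by the threshold. During training, the coordinate loss would typically be masked for slots with objectness 0, and the assignment of ground-truth boxes to slots needs a consistent rule, which is the role anchor matching plays in SSD-style models.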

tensorflow deep-learning

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
