CoreML performance cost of Vision Framework vs. CVPixelBuffer?

There are two options for passing images into CoreML models:

  1. Pass a CGImage to the Vision framework and let it handle the preprocessing (scaling, cropping, rotation)
  2. Pass a CVPixelBuffer directly to CoreML (both paths are sketched below)

Is there any data on the memory and processing overhead associated with using the Vision framework vs. passing a CVPixelBuffer directly to CoreML?
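For reference, the two call paths look roughly like this. This is a minimal sketch: `model` stands for any compiled CoreML model that takes an image input, the feature name "image" is an assumption rather than something read from a real model, and error handling is elided.

```swift
import CoreML
import Vision

// Option 1: Vision — hand it a CGImage and let Vision do the crop/scale/rotate work.
func classifyWithVision(cgImage: CGImage, model: MLModel) throws {
    let vnModel = try VNCoreMLModel(for: model)
    let request = VNCoreMLRequest(model: vnModel) { request, _ in
        // Results arrive as VNObservation subclasses, e.g. VNClassificationObservation.
        print(request.results ?? [])
    }
    request.imageCropAndScaleOption = .centerCrop
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])
}

// Option 2: CoreML directly — you are responsible for handing it a CVPixelBuffer
// that already matches the model's expected size, format, and orientation.
func classifyDirectly(pixelBuffer: CVPixelBuffer, model: MLModel) throws {
    let input = try MLDictionaryFeatureProvider(
        dictionary: ["image": MLFeatureValue(pixelBuffer: pixelBuffer)]  // "image" = assumed input name
    )
    let output = try model.prediction(from: input)
    print(output)
}
```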

Thoughts based on what I've seen while debugging:

Memory

Assuming the data is already in a CVPixelBuffer, creating the CGImage to pass to Vision appears to roughly double the memory usage. From the allocations it looks like Vision creates a new CoreVideo/CoreImage object in createPixelBufferFromVNImageBuffer, which makes sense: it needs its own copy of the image to crop, rotate, and scale.
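To illustrate where the extra allocation tends to come from: just converting an existing CVPixelBuffer into a CGImage materializes a second copy of the pixels before Vision ever sees it. Below is one way to do that conversion, using VideoToolbox; this is only a sketch of the CGImage-creation step, not a claim about what Vision does internally.

```swift
import VideoToolbox

// Converting an existing CVPixelBuffer to a CGImage allocates a second
// backing store for the pixel data — this is where the "doubled" memory shows up.
func makeCGImage(from pixelBuffer: CVPixelBuffer) -> CGImage? {
    var cgImage: CGImage?
    // Returns an OSStatus; on failure cgImage simply stays nil.
    VTCreateCGImageFromCVPixelBuffer(pixelBuffer, options: nil, imageOut: &cgImage)
    return cgImage
}
```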

Processing

You're going to have to do the rotation and/or scaling either way, and I'd assume Vision does them at least as efficiently as you could by hand with Accelerate, so there shouldn't be any meaningful overhead here.
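For comparison, the "by hand" path would look something like the following with vImage. This is a rough sketch that assumes a 4-channel 8-bit pixel format such as kCVPixelFormatType_32BGRA; a real implementation would also handle pixel-format conversion and rotation, which are omitted here.

```swift
import Accelerate
import CoreVideo

// Scale a 32-bit-per-pixel CVPixelBuffer to the model's input size using vImage.
// Assumes a 4-channel 8-bit format such as kCVPixelFormatType_32BGRA.
func scale(_ source: CVPixelBuffer, to size: CGSize) -> CVPixelBuffer? {
    CVPixelBufferLockBaseAddress(source, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(source, .readOnly) }

    var srcBuffer = vImage_Buffer(
        data: CVPixelBufferGetBaseAddress(source),
        height: vImagePixelCount(CVPixelBufferGetHeight(source)),
        width: vImagePixelCount(CVPixelBufferGetWidth(source)),
        rowBytes: CVPixelBufferGetBytesPerRow(source))

    var destination: CVPixelBuffer?
    CVPixelBufferCreate(kCFAllocatorDefault, Int(size.width), Int(size.height),
                        CVPixelBufferGetPixelFormatType(source), nil, &destination)
    guard let dest = destination else { return nil }

    CVPixelBufferLockBaseAddress(dest, [])
    defer { CVPixelBufferUnlockBaseAddress(dest, []) }

    var destBuffer = vImage_Buffer(
        data: CVPixelBufferGetBaseAddress(dest),
        height: vImagePixelCount(Int(size.height)),
        width: vImagePixelCount(Int(size.width)),
        rowBytes: CVPixelBufferGetBytesPerRow(dest))

    // High-quality resampling — this is the kind of work Vision does for you internally.
    let error = vImageScale_ARGB8888(&srcBuffer, &destBuffer, nil,
                                     vImage_Flags(kvImageHighQualityResampling))
    return error == kvImageNoError ? dest : nil
}
```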


