AWS Lambda: TensorFlow package exceeds the 250 MB size limit

I need to run segmentation predictions with a tensorflow-keras model. My idea was to upload an image to an S3 bucket and, with the AWS Lambda service, trigger the process, run the prediction, and save the predicted segmentation mask to a new S3 bucket.

Initially, I created some Lambda layers with the libraries below to run the model:

  • tensorflow 2.0.0b1
  • keras 2.3.1
  • opencv
  • segmentation-models
  • boto3

The problem is that the tensorflow version (2.0.0b1) alone already exceeds the service's 250 MB deployment-package size limit, so a size error appears when I try to upload the files.

I think I can rewrite my code to avoid using segmentation-models, and perhaps use tensorflow.keras instead of the standalone Keras package, but I still need tensorflow, opencv, and boto3 to connect to the AWS services.
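For reference, dropping the standalone Keras dependency is usually just an import swap; a minimal sketch (the model file name and input shape are hypothetical):

```python
import numpy as np
from tensorflow import keras  # Keras API shipped inside TensorFlow, no extra package

# Hypothetical model file and input shape -- adjust to your network.
model = keras.models.load_model("model.h5")
batch = np.zeros((1, 256, 256, 3), dtype=np.float32)  # one dummy RGB image
mask = model.predict(batch)
```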

Does anyone know how I should proceed?



Solution 1:[1]

UPDATE: AWS Lambda now supports container image deployment, which allows a 10 GB Lambda package and makes it possible to run larger workloads on AWS Lambda. AWS Lambda has also increased ephemeral storage to 10 GB.

The 250 MB hard limit still applies to AWS Lambda deployment packages that are not container images. This quota covers all the files you upload, including layers and custom runtimes.

With AWS Lambda you can configure the amount of memory and how long the process runs. The memory available to the function during execution is between 128 MB and 3,008 MB, in 64 MB increments. AWS Lambda allocates CPU power linearly in proportion to the configured memory; at 1,792 MB, a function has the equivalent of one full vCPU (one vCPU-second of credits per second). This means AWS Lambda is not designed for CPU-intensive workloads. Furthermore, AWS Lambda does not support GPUs at this time. There are other limits as well, such as the number of concurrent executions. For a complete list of limits, please visit this page.
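Memory (and therefore CPU) is set per function; a minimal sketch using boto3, where the function name and values are examples, not a prescription:

```python
import boto3

lambda_client = boto3.client("lambda")

# Raise memory to get proportionally more CPU; name and values are examples.
lambda_client.update_function_configuration(
    FunctionName="segmentation-trigger",  # hypothetical function name
    MemorySize=3008,                      # MB; CPU scales linearly with this
    Timeout=900,                          # seconds; 15 minutes is the hard maximum
)
```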

Have you looked at Amazon SageMaker for your ML endpoints? With Amazon SageMaker you can specify the type of instance you want to train on, or the instance you want to do inference on. With Amazon SageMaker you also have access to the AWS Inferentia chips, among others. Amazon SageMaker has limits too, which are available here.

You can also deploy an inference pipeline.

In your use case, you can deploy your model to an Amazon SageMaker endpoint. When an image enters the bucket, trigger an AWS Lambda function that calls the Amazon SageMaker endpoint, gets the result back, and continues processing as you require; a sketch of such a trigger function follows below. Note that the maximum run time of a Lambda function is 15 minutes. You can use AWS Step Functions to invoke the Amazon SageMaker endpoint and then call another Lambda function; Step Functions lets you decouple the AWS Lambda function from the call, decreasing the Lambda's runtime cost and allowing long-running jobs. To decrease the cost even more, you can use AWS Step Functions Express Workflows.
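As a rough illustration of that flow, here is a minimal sketch of an S3-triggered handler that forwards the uploaded image to a SageMaker endpoint; the endpoint name, output bucket, and content type are assumptions to be replaced with your own:

```python
import json

import boto3

s3 = boto3.client("s3")
sm_runtime = boto3.client("sagemaker-runtime")

ENDPOINT_NAME = "segmentation-endpoint"   # hypothetical endpoint name
OUTPUT_BUCKET = "predicted-masks-bucket"  # hypothetical output bucket


def handler(event, context):
    # The S3 trigger delivers the uploaded object's bucket and key.
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    # Fetch the raw image bytes and let the endpoint do the heavy lifting;
    # the Lambda stays far under 250 MB because TensorFlow lives in SageMaker.
    image_bytes = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    response = sm_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/x-image",  # must match what your model expects
        Body=image_bytes,
    )

    # Store the predicted mask in the output bucket.
    mask = response["Body"].read()
    s3.put_object(Bucket=OUTPUT_BUCKET, Key=f"masks/{key}", Body=mask)
    return {"statusCode": 200, "body": json.dumps({"mask_key": f"masks/{key}"})}
```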

More information is available on how to build, test, and deploy your models using Amazon SageMaker. Here is a sample implementation that does what you are asking for: it triggers an AWS Lambda function as soon as a file is dropped in a bucket, the AWS Lambda function starts a Step Functions workflow that invokes Amazon SageMaker, and the results update a table and send an email.

This article explains how to deploy a trained Keras or TensorFlow model using Amazon SageMaker.
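For orientation, deploying a trained Keras/TensorFlow model with the SageMaker Python SDK typically looks like the sketch below; the S3 model artifact, IAM role, framework version, and instance type are assumptions, not fixed requirements:

```python
from sagemaker.tensorflow import TensorFlowModel

# Hypothetical artifact location and role -- replace with your own.
model = TensorFlowModel(
    model_data="s3://my-models/segmentation/model.tar.gz",
    role="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    framework_version="2.1",  # match the TensorFlow version used for training
)

# Provision a real-time endpoint; the Lambda sketch above can then invoke it.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
)
```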

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
