Running GPU accelerated apps in a Docker container on AWS Elastic Beanstalk

How can I set up AWS Elastic Beanstalk so that GPU accelerated apps can run in a Docker container?



Solution 1:[1]

If you use a Deep Learning AMI with Beanstalk, some Beanstalk features, such as health monitoring and sqsd in worker environments, may not work. In my case, I built a custom AMI from the Beanstalk Docker AMI and deployed the application using docker-compose with GPU support enabled. Details are as follows:

  1. Look up the Beanstalk Docker AMI ID in the default Beanstalk configuration.
  2. Start an EC2 instance from the Beanstalk Docker AMI on a GPU instance type (e.g. g4dn.xlarge).
  3. Install the NVIDIA driver (requires S3 access): https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/install-nvidia-driver.html
        # Build tools and kernel headers needed by the driver installer
        sudo yum install -y gcc kernel-devel-$(uname -r)
        # Download the driver installer from the AWS-hosted bucket
        aws s3 cp --recursive s3://nvidia-gaming/linux/latest/ .
        chmod +x NVIDIA-Linux-x86_64*.run
        sudo ./NVIDIA-Linux-x86_64*.run
        # Confirm that the driver can see the GPU
        nvidia-smi
  4. Install the NVIDIA Container Toolkit (DO NOT INSTALL DOCKER; Docker will be installed by Beanstalk): https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installing-on-amazon-linux
        distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
           && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo
        sudo yum clean expire-cache
        sudo yum install nvidia-docker2
  5. Stop the instance and create an AMI from it.
  6. Set your AMI ID in the Beanstalk configuration.
  7. Deploy the Docker container with a docker-compose.yaml that requests the GPU (https://docs.docker.com/compose/gpu-support/):
        services:
          test:
            image: nvidia/cuda:10.2-base
            command: nvidia-smi
            deploy:
              resources:
                reservations:
                  devices:
                    - driver: nvidia
                      count: all
                      capabilities: [gpu]
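
The AMI creation and AMI ID steps can also be scripted with the AWS CLI instead of the console. The following is a minimal sketch and not part of the original answer: the instance ID, AMI name, and environment name are hypothetical placeholders, and with the default DRY_RUN=1 the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: bake an AMI from the prepared GPU instance and point an existing
# Beanstalk environment at it. All IDs and names are hypothetical placeholders.
set -eu

INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"  # placeholder builder instance
AMI_NAME="${AMI_NAME:-beanstalk-docker-gpu}"       # placeholder AMI name
ENV_NAME="${ENV_NAME:-my-gpu-env}"                 # placeholder environment name
DRY_RUN="${DRY_RUN:-1}"                            # 1 = print commands only

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bake_and_deploy() {
  # Stop the instance so the image is taken from a consistent disk state.
  run aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
  run aws ec2 wait instance-stopped --instance-ids "$INSTANCE_ID"

  if [ "$DRY_RUN" = "1" ]; then
    echo "+ aws ec2 create-image --instance-id $INSTANCE_ID --name $AMI_NAME"
    ami_id="ami-PLACEHOLDER"
  else
    ami_id="$(aws ec2 create-image --instance-id "$INSTANCE_ID" \
      --name "$AMI_NAME" --query ImageId --output text)"
    aws ec2 wait image-available --image-ids "$ami_id"
  fi

  # Tell the environment to launch its instances from the new AMI.
  run aws elasticbeanstalk update-environment \
    --environment-name "$ENV_NAME" \
    --option-settings "Namespace=aws:autoscaling:launchconfiguration,OptionName=ImageId,Value=$ami_id"
}

bake_and_deploy
```

Run it with DRY_RUN=0 and your own INSTANCE_ID and ENV_NAME only after reviewing the printed commands; the environment must already use a GPU instance type.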

Note: For a Beanstalk worker environment (https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create_deploy_docker.container.console.html):

If you manage your Docker environment with Docker Compose, Elastic Beanstalk assumes that you run a proxy server as a container. Therefore it defaults to None for the Proxy server setting, and Elastic Beanstalk does not provide an NGINX configuration.

Solution 2:[2]

This answer was posted by Piotr Walczyszyn (under the CC BY-SA 3.0 license) in the original revision of the question Running GPU accelerated apps in a Docker container on AWS Elastic Beanstalk. Reposted here to conform to SO's Q&A format.


This is how I managed to run GPU accelerated apps in a Docker container on AWS Elastic Beanstalk. It took me almost two days of trial and error, so I hope this post saves others that time.

These are the settings of my Elastic Beanstalk environment:

  1. Set the AMI ID to one of those with the NVIDIA drivers and Docker preconfigured. I used Deep Learning AMI (Amazon Linux) Version 25.3 - ami-068d6d02d8775ec52.
  2. Set the instance type to one of the EC2 instance types with GPUs. I used g3s.xlarge, as it seemed sufficient for what I needed to do.
  3. In your app's Dockerfile, add the following lines:
        ENV NVIDIA_VISIBLE_DEVICES all
        ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
  4. In your app root folder, create the file .ebextensions/01-nvidia-docker.config and paste the following code there:

```yaml
option_settings:
  aws:autoscaling:launchconfiguration:
    # I had to increase storage space for Docker to run my container
    BlockDeviceMappings: /dev/xvdcz=:24:true

commands:
  configure_docker_run:
    # I had to update the EB startup script so it passes --runtime=nvidia when running docker
    command: sed -i 's/docker run -d/docker run --runtime=nvidia -d/' /opt/elasticbeanstalk/hooks/appdeploy/enact/00run.sh
    # Use the following line instead if Docker is updated to 19.03 or later
    # command: sed -i 's/docker run -d/docker run --gpus all -d/' /opt/elasticbeanstalk/hooks/appdeploy/enact/00run.sh
```
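
Before relying on the sed substitution from the .ebextensions config, it may help to preview what it does to a docker run line. The sketch below applies the same expression to a sample line; the sample is hypothetical and not copied from the real 00run.sh. It also applies the expression twice, to show that re-running it on an already patched line should leave it unchanged.

```shell
#!/bin/sh
# Sketch: preview the substitution on a made-up line (not the real 00run.sh).

# Same sed expression as in the .ebextensions config, reading from stdin.
add_nvidia_runtime() {
  sed 's/docker run -d/docker run --runtime=nvidia -d/'
}

sample='docker run -d --name app my-image:latest'   # hypothetical input line

once="$(printf '%s\n' "$sample" | add_nvidia_runtime)"
twice="$(printf '%s\n' "$once" | add_nvidia_runtime)"

echo "original:      $sample"
echo "patched once:  $once"
echo "patched twice: $twice"
```

On the instance itself, running grep 'docker run' against /opt/elasticbeanstalk/hooks/appdeploy/enact/00run.sh after a deployment is one way to check whether the pattern matched; if the platform version's script does not contain "docker run -d", the substitution silently changes nothing.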

Now you can enjoy a Docker container running on Elastic Beanstalk with the NVIDIA GPU enabled.
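
To confirm that the GPU is actually reachable, a quick check on the instance can help. The following is a rough sketch under the assumption that the NVIDIA runtime is registered with Docker under the name "nvidia" (as nvidia-docker2 normally does); the function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: sanity checks for a GPU-enabled Docker host.
# Function names are hypothetical helpers, not part of any tool.

# Succeeds if the JSON on stdin lists a runtime named "nvidia".
has_nvidia_runtime() {
  grep -q '"nvidia"'
}

# Checks the driver tool and the Docker runtime list on the current host.
check_gpu_host() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi not found: driver is probably not installed"
    return 1
  fi
  if ! docker info --format '{{json .Runtimes}}' | has_nvidia_runtime; then
    echo "nvidia runtime is not registered with Docker"
    return 1
  fi
  echo "driver and nvidia runtime look available"
}
```

After loading these functions on the instance, check_gpu_host followed by docker run --rm --runtime=nvidia nvidia/cuda:10.2-base nvidia-smi should print the GPU table if everything is wired up.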

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cigien
Solution 2