Unable to run script/package locally on Docker
I'm building a web scraper package and am in the process of packaging it as a Docker image. The image builds locally without problems, but running the script in the container fails with the following error:
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
Here is what I have in the main script so far to try and get it running on Docker:
def __init__(self, url: str = " url goes here ",
             options: Optional[ChromeOptions] = None): #default url
    options = ChromeOptions()
    self.driver = Chrome(ChromeDriverManager().install(), options=options)
    options.add_argument("--no-sandbox")
    options.binary_location = '/usr/bin/google-chrome'
    options.add_argument("--headless")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-setuid-sandbox")
    options.add_argument("--remote-debugging-port=9222")
    options.add_argument("start-maximized")
    options.add_argument('--disable-gpu')
    options.add_argument("window-size=1920,1080")
Other posts suggest that for some people the fix was as simple as changing the order of the options.add_argument calls. I have tried that, but it did not work for me.
I also have the following modules within the same script:
import os
import selenium
from selenium.webdriver import Chrome
from webdriver_manager.chrome import ChromeDriverManager #installs Chrome webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException
from selenium.webdriver import ChromeOptions
from selenium.webdriver.chrome.service import Service
from typing import Optional
import time
import boto3
from sqlalchemy import create_engine
import urllib.request
import tempfile #temporary directory - to be removed after all operations have finished
In my Dockerfile:
FROM python:3.8
#Set Chrome Repo
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -\
&& sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list'\
&& apt-get -y update\
#Install Chrome
&& apt-get install -y google-chrome-stable\
&& wget -O /tmp/chromedriver.zip http://chromedriver.storage.googleapis.com/`curl -sS chromedriver.storage.googleapis.com/LATEST_RELEASE`/chromedriver_linux64.zip\
&& apt-get install -yqq unzip\
&& unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
COPY . .
RUN pip install -r requirements.txt
#When we run the container, this will be the command run
CMD ["python", "scraper/webscraper.py"]
In case it matters, I am using Docker and VS Code on Windows.
Solution 1:[1]
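I've realised that swapping the order of the options.add_argument calls and the self.driver assignment makes the script run just fine. In the original code the driver was created first, so Chrome had already been launched before any of the arguments (including --headless and --no-sandbox) were added to options, and they had no effect. The options must be fully configured before they are passed to Chrome, as follows: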
def __init__(self, url: str = " url goes here ",
             options: Optional[ChromeOptions] = None): #default url
    # Only create fresh options if the caller did not supply any
    if options is None:
        options = ChromeOptions()
    # Configure every option first...
    options.add_argument("--no-sandbox")
    options.binary_location = '/usr/bin/google-chrome'
    options.add_argument("--headless")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--disable-setuid-sandbox")
    options.add_argument("--remote-debugging-port=9222")
    options.add_argument("start-maximized")
    options.add_argument('--disable-gpu')
    options.add_argument("window-size=1920,1080")
    # ...then create the driver, so Chrome starts with all of them applied
    self.driver = Chrome(ChromeDriverManager().install(), options=options)
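The same ordering can be sketched as a standalone helper. This is only an illustration, and it assumes a Selenium 4 release in which Chrome is given a Service object (already imported in the question) instead of a positional driver path. The names DOCKER_CHROME_ARGS, build_chrome_args and make_driver are hypothetical and not part of the original script. The flag list is a subset of the flags in the answer above, with window-size written in its "--" form.

```python
from typing import List, Optional

# Flags taken from the answer above (subset); names here are illustrative only.
DOCKER_CHROME_ARGS: List[str] = [
    "--no-sandbox",
    "--headless",
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--window-size=1920,1080",
]


def build_chrome_args(extra: Optional[List[str]] = None) -> List[str]:
    """Return the container flags plus any caller-supplied ones, without duplicates."""
    args = list(DOCKER_CHROME_ARGS)
    for arg in extra or []:
        if arg not in args:
            args.append(arg)
    return args


def make_driver(extra: Optional[List[str]] = None):
    """Configure the options completely, then start Chrome."""
    # Imported lazily so build_chrome_args() can be used without Selenium installed.
    from selenium.webdriver import Chrome, ChromeOptions
    from selenium.webdriver.chrome.service import Service
    from webdriver_manager.chrome import ChromeDriverManager

    options = ChromeOptions()
    options.binary_location = "/usr/bin/google-chrome"  # path from the question
    for arg in build_chrome_args(extra):
        options.add_argument(arg)

    # The driver is created last, so Chrome is launched with every flag above.
    service = Service(ChromeDriverManager().install())
    return Chrome(service=service, options=options)
```

Inside the class, self.driver = make_driver() would then replace the inline option setup. Keeping the flag list separate from the driver construction makes it harder to accidentally add an argument after Chrome has already started.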
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | zp24 |
