How do I scrape the big images from the navigation?
I am trying to scrape the big versions of the images in the navigation on "https://www.akrapovic.com/en/car/product/16722/Ferrari/488-GTB-488-Spider/Slip-On-Line-Titanium?brandId=20&modelId=785&yearId=5447". Unfortunately, my code only gets the tiny thumbnail images.
collected_HTML_Tag = driver.find_element(By.XPATH,"//nav/ul/li[1]/a/img").get_attribute('src')
print(collected_HTML_Tag)
How can I improve my code to get the images mentioned above?
For a better understanding, consider the following images. I need the images marked in red.
Solution 1:[1]
You can save all 4 large images using the code below. It clicks each thumbnail in turn and saves a screenshot of the large image element:
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Save a screenshot of the large image in the current folder
def downloadImage(driver, j):
    # screenshot_as_png returns PNG bytes, so the file gets a .png extension
    with open(f'Image{j}.png', 'wb') as file:
        l = driver.find_element(
            By.XPATH,
            '/html/body/ak-app/div[1]/abstract/products/section/product-details/section[1]/div[2]/div/div[2]/div[2]/div/img')
        file.write(l.screenshot_as_png)

driver = webdriver.Chrome()
driver.implicitly_wait(5)
driver.maximize_window()
driver.get('https://www.akrapovic.com/en/car/product/16722/Ferrari/488-GTB-488-Spider/Slip-On-Line-Titanium?brandId=20&modelId=785&yearId=5447')

# Click each thumbnail in turn (the first large image is already displayed)
for j in range(1, 5):
    print(j)
    if j != 1:
        driver.find_element(
            By.XPATH,
            f'/html/body/ak-app/div[1]/abstract/products/section/product-details/section[1]/div[2]/div/div[2]/div[2]/nav/ul/li[{j}]/a/img').click()
        time.sleep(5)  # give the large image time to load after the click
    downloadImage(driver, j)

driver.quit()
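The fixed `time.sleep(5)` in the loop above always waits the full five seconds after each click. As a minimal sketch of a polling alternative, the helper below retries a condition until it holds or a timeout expires. The name `wait_until` and its parameters are hypothetical and not part of Selenium; Selenium's own `WebDriverWait(driver, timeout).until(...)` follows the same pattern and would normally be the better choice in a real script.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without the condition becoming truthy.
    (Hypothetical helper for illustration only.)
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(poll)  # short pause between checks
```

In the loop above, one possible condition would be a lambda that compares the large image's current `src` attribute with the value it had before the click, so the script continues as soon as the image has changed.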
Solution 2:[2]
You don't need Selenium. You can get these image URLs in less than a second using just requests, because there is a backend API request that you can recreate from the given URL:
import requests
given_url = 'https://www.akrapovic.com/en/car/product/16722/Ferrari/488-GTB-488-Spider/Slip-On-Line-Titanium?brandId=20&modelId=785&yearId=5447'
code = given_url.split('/')[6]  # product code from the URL path, here '16722'
suffix = given_url.split('?')[-1]  # query string after the '?'
headers = {'user-agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36'}
url = f'https://www.akrapovic.com/api2/en-US/products/car/{code}?$inlinecount=allpages&{suffix}'
resp = requests.get(url,headers=headers).json()
images = [x['Image'] for x in resp['ProductImages']]
print(images)
That API URL can be found in your browser's Developer Tools, in the Network tab under the Fetch/XHR filter (refresh the page and you'll see the request load).
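The URL handling above relies on fixed positions in `split('/')`, which breaks if the page URL gains or loses a path segment. As a sketch under stated assumptions, the same API URL can be built with `urllib.parse`, and the image list can be read defensively. The function names `build_api_url` and `extract_image_urls` are hypothetical; the API URL pattern and the `ProductImages` / `Image` keys are taken from the code above and are not otherwise verified.

```python
from urllib.parse import urlsplit

# API URL pattern copied from the answer above (not independently verified)
API_TEMPLATE = ('https://www.akrapovic.com/api2/en-US/products/car/'
                '{code}?$inlinecount=allpages&{query}')


def build_api_url(page_url):
    """Build the backend API URL from a product page URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    segments = [s for s in parts.path.split('/') if s]
    # Assumes the product code is the path segment right after 'product'
    code = segments[segments.index('product') + 1]
    return API_TEMPLATE.format(code=code, query=parts.query)


def extract_image_urls(payload):
    """Collect non-empty 'Image' values from the decoded JSON response."""
    return [item['Image']
            for item in payload.get('ProductImages', [])
            if item.get('Image')]
```

The result of `build_api_url(given_url)` can be passed to `requests.get(...)` exactly as in the code above, and `extract_image_urls(resp)` then replaces the list comprehension while tolerating a missing `ProductImages` key.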
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Devam Sanghvi |
| Solution 2 | bushcat69 |


