Flutter get all images from website with given URL

I am trying to scrape any website for its images and save them in a list. For that I am using getElementsByTagName("img") and reading each element's ['src'] attribute, like this:

  void _getData() async {
    final response = await http.get(Uri.parse(_currentUrl));
    final host = Uri.parse(_currentUrl).host;
    dom.Document document = parser.parse(response.body);
    final elements = document.getElementsByTagName("img").toList();
    for (var element in elements) {
      var imageSource = element.attributes['src'] ?? '';
      print(imageSource);
      bool validURL = Uri.parse(imageSource).host == '' ||
              Uri.parse(host + imageSource).host == ''
          ? false
          : true;

      if (validURL && !imageSource.endsWith('svg')) {
        Uri imageSourceUrl = Uri.parse(imageSource);
        if (imageSourceUrl.host.isEmpty) {
          imageSource = host + imageSource;
        }

        if (_imagesWithSize.firstWhereOrNull(
              (element) => element.imageUrl == imageSource,
            ) ==
            null) {
          Size size = await _calculateImageDimension(imageSource);
          _imagesWithSize.add(
            ImageWithSize(
              imageSource,
              size,
            ),
          );
        }
      }
    }
    _imagesWithSize.sort(
      (a, b) => (b.imageSize.height * b.imageSize.width).compareTo(
        a.imageSize.height * a.imageSize.width,
      ),
    );
  }

Problem:

This does not work with this link:

HM Productlink

I get this URL:

//lp2.hm.com/hmgoepprod?set=quality%5B79%5D%2Csource%5B%2F0c%2Fe6%2F0ce67f87aa6691557f30371590cf854ed0fb77c7.jpg%5D%2Corigin%5Bdam%5D%2Ccategory%5B%5D%2Ctype%5BLOOKBOOK%5D%2Cres%5Bm%5D%2Chmver%5B1%5D&call=url[file:/product/main]

And this is not a valid URL...

How can I parse the image from this website?

Let me know if you need any more info!



Solution 1:[1]

Links with leading double slashes are valid in HTML. They are scheme-relative (network-path) references, defined in RFC 1808 and carried over into RFC 3986, and they are resolved with the scheme (http or https) of the page that contains them. It will probably work if you add the scheme (http: or https:) from _currentUrl to imageSourceUrl.

I have not tested this, but I assume something like this would work:

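// A src such as "//host/path" parses with a host but no scheme.
if (!imageSourceUrl.hasScheme) {
  // Reuse the scheme of the page the image was found on.
  final scheme = Uri.parse(_currentUrl).scheme;
  imageSourceUrl = imageSourceUrl.replace(scheme: scheme);
}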
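
As an alternative, here is a minimal sketch that lets Dart's built-in Uri.resolve do the work instead of patching the scheme by hand. It resolves a src value against the page URL, so it should cover scheme-relative ("//host/..."), root-relative ("/path") and plain relative references in one call. The helper name resolveImageUrl is hypothetical and not part of the original code, and I have not run it against the HM page itself:

```dart
/// Hypothetical helper: resolves an <img> src value against the URL of the
/// page it was found on. Returns null if the value is empty or unparsable.
Uri? resolveImageUrl(String pageUrl, String src) {
  final trimmed = src.trim();
  if (trimmed.isEmpty) return null;
  try {
    // Uri.resolve follows the standard reference-resolution rules, so
    // "//cdn.example.com/a.jpg" inherits the scheme of pageUrl and
    // "/img/a.jpg" inherits both scheme and host.
    return Uri.parse(pageUrl).resolve(trimmed);
  } on FormatException {
    // Uri.parse and Uri.resolve throw FormatException on malformed input.
    return null;
  }
}
```

Inside the loop from the question, this could stand in for the validURL check and the host + imageSource concatenation, for example: final resolved = resolveImageUrl(_currentUrl, imageSource); and then resolved?.toString() would be passed on whenever resolved is not null.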

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1