Trying to get all the external links on a website with IMPORTXML

I would really appreciate some advice here.

This should work, but it doesn't:

=IMPORTXML("www.michaelcropper.co.uk",
 "//a[not(contains(@href,‘www.michaelcropper.co.uk’))]/@href","en_US")

Basically, this XPath query isn't right:

//a[not(contains(@href, example.com))]/@href 

I really can't figure out why. Any suggestions?

I have tried writing this in several different ways, and I also changed the Google Sheets locale, but neither helped.



Solution 1:[1]

Try selecting the href attributes directly and filtering on their own value:

=importxml(url,"//@href[not(contains(.,'michaelcropper.co.uk'))]")

or, with the domain taken from a cell (here B1):

=transpose(importxml("https://www."&B1,"//@href[not(contains(.,'"&B1&"'))]"))
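
To see what the filter in these formulas keeps and drops, here is a minimal Python sketch that emulates //@href[not(contains(.,'michaelcropper.co.uk'))] outside of Sheets. It uses only the standard-library html.parser rather than a real XPath engine, and the sample markup, the class HrefCollector and the function hrefs_not_containing are hypothetical names made up for illustration, not taken from the site or the answer.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect every href attribute value, roughly like the XPath //@href."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # //@href matches href on any element (a, link, area, ...), not only <a>.
        for name, value in attrs:
            if name == "href":
                # A bare attribute without a value is treated as an empty string.
                self.hrefs.append(value or "")


def hrefs_not_containing(html, domain):
    """Emulate //@href[not(contains(., domain))] on an HTML string."""
    collector = HrefCollector()
    collector.feed(html)
    # Like XPath contains(), the substring test is case-sensitive.
    return [h for h in collector.hrefs if domain not in h]


# Hypothetical sample markup, not taken from the real site.
SAMPLE = """
<html><head>
<link rel="stylesheet" href="https://www.michaelcropper.co.uk/style.css">
</head><body>
<a href="https://www.michaelcropper.co.uk/about">About</a>
<a href="https://example.org/page">External</a>
<a href="/contact">Relative</a>
</body></html>
"""

print(hrefs_not_containing(SAMPLE, "michaelcropper.co.uk"))
# prints ['https://example.org/page', '/contact']
```

Note that the relative link /contact passes the filter, because it does not contain the domain text. If only absolute external URLs are wanted, one option (an assumption, not part of the answer above) is to add an XPath 1.0 starts-with test, for example //@href[starts-with(.,'http') and not(contains(.,'michaelcropper.co.uk'))].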


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1