Extract part of a URL with a dynamic form
I need to extract part of strings that represent URLs. They come from an API response, and I need to get a specific part (called the ASIN).
Examples
The ASIN is the part of the URL string after /dp and before /ref, so I extract it like this:
print(f"asin {url.split('/')[-2]}")
prints B091JJZPCM,B07P3CTC3Z
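The positional split above only works when the ASIN happens to be the second-to-last segment. A minimal sketch of a less position-dependent variant looks up the path segment that follows "dp" instead. The helper name asin_from_path and the example URL are hypothetical, since the original example URLs are not shown here.

```python
from urllib.parse import urlparse


def asin_from_path(url):
    """Return the path segment that follows 'dp', or None if there is none."""
    # Split only the path, so the query string cannot shift the segments.
    segments = urlparse(url).path.split("/")
    if "dp" in segments:
        idx = segments.index("dp")
        if idx + 1 < len(segments):
            return segments[idx + 1]
    return None


# Hypothetical URL in the /dp/<ASIN>/ref=... shape described above.
example = "https://www.amazon.com/Some-Product/dp/B091JJZPCM/ref=sr_1_1"
print(f"asin {asin_from_path(example)}")
```

This still returns None for URLs that carry no /dp segment in the path, so those need separate handling.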
But some URLs have a different pattern, like
In this case, how do you extract the ASIN part?
Solution 1:[1]
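I solved it by checking whether the extracted segment contains "html". If it does, the real product path is in the url query parameter, so I rebuild the URL from that and split again:
from urllib.parse import urlparse, parse_qs

url = "https://amazon.com/**********************"
asin = url.split("/")[-2]
if "html" in asin:
    # The segment is a redirect page, not an ASIN.
    print(f"url {asin}")
    parsed_url = urlparse(url)
    # parse_qs decodes the percent-encoded product path held in "url".
    captured_value = parse_qs(parsed_url.query)["url"][0]
    url = f"https://amazon.com{captured_value}"
    print(f"captured url {url}")
    asin = url.split("/")[-2]
    print(f"captured asin {asin}")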
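For reference, the same idea can be wrapped in one function that handles both URL shapes. This is only a sketch, not the answer's method: it swaps the positional split for a regular expression, and it assumes an ASIN is 10 uppercase alphanumeric characters following /dp/. The name extract_asin and both example URLs are hypothetical.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumption: an ASIN is 10 uppercase letters/digits right after "/dp/".
ASIN_RE = re.compile(r"/dp/([A-Z0-9]{10})(?:[/?]|$)")


def extract_asin(url):
    """Return the ASIN from a direct or redirect-style URL, else None."""
    parsed = urlparse(url)
    # Direct form: the ASIN sits in the path itself.
    match = ASIN_RE.search(parsed.path)
    if match:
        return match.group(1)
    # Redirect form: the product path is carried in the "url" query parameter.
    nested = parse_qs(parsed.query).get("url")
    if nested:
        match = ASIN_RE.search(nested[0])
        if match:
            return match.group(1)
    return None


# Hypothetical URLs illustrating the two shapes.
direct = "https://www.amazon.com/Some-Product/dp/B091JJZPCM/ref=sr_1_1"
redirect = (
    "https://www.amazon.com/gp/redirect.html/ref=abc"
    "?ie=UTF8&url=%2FSome-Product%2Fdp%2FB07P3CTC3Z%2Fref%3Dsr_1_2"
)
print(extract_asin(direct), extract_asin(redirect))
```

Using .get("url") rather than indexing avoids a KeyError when a redirect-looking URL has no url parameter; the function then returns None and the caller can decide what to do.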
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | ira |