Coursera URL web scraping

I have Python code that scrapes Coursera course details such as course_title, ratings, and number of students, but I want the course link as well. Can someone help me get each course URL from Coursera?



Solution 1:[1]

I had a look at coursera.org and found a way to scrape the course URLs too.

Here is what you want to do:

  1. Find all a elements whose data-click-key attribute equals search.search.click.search_card.
  2. Build a list of the href values from those elements.

Here is the code:

# Assumes soup is a BeautifulSoup object for the search results page
# (in this example, a search for Python courses).
base = "https://www.coursera.org"
titles = soup.find_all("h2", class_="card-title")
urls = soup.find_all("a", attrs={"data-click-key": "search.search.click.search_card"})

# In case you need a list of the (relative) URLs
url_list = [a["href"] for a in urls]

# The hrefs are relative paths, so prepend the base to get full links
for title, url in zip(titles, urls):
    print(title.text + ": " + base + url["href"])
    
Output:

Python for Everybody: https://www.coursera.org/specializations/python
Python 3 Programming: https://www.coursera.org/specializations/python-3-programming
IBM Data Science: https://www.coursera.org/professional-certificates/ibm-data-science
Google IT Automation with Python: https://www.coursera.org/professional-certificates/google-it-automation
Applied Data Science with Python: https://www.coursera.org/specializations/data-science-python
Programming for Everybody (Getting Started with Python): https://www.coursera.org/learn/python
Crash Course on Python: https://www.coursera.org/learn/python-crash-course
Python for Data Science and AI: https://www.coursera.org/learn/python-for-applied-data-science-ai
Introducción a la programación en Python I: Aprendiendo a programar con Python: https://www.coursera.org/learn/aprendiendo-programar-python
Python Basics: https://www.coursera.org/learn/python-basics
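The same two steps can also be tried without a live page. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML is made-up markup standing in for a downloaded search page, CardLinkParser and extract_course_urls are hypothetical helper names, and the standard library's html.parser is swapped in for BeautifulSoup so the snippet runs with no extra installs. It also uses urllib.parse.urljoin instead of string concatenation, which leaves a link unchanged if it is already absolute.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE = "https://www.coursera.org"
CARD_KEY = "search.search.click.search_card"

# Hypothetical markup standing in for a downloaded search results page.
SAMPLE_HTML = """
<a data-click-key="search.search.click.search_card" href="/specializations/python">
  <h2 class="card-title">Python for Everybody</h2>
</a>
<a data-click-key="search.search.click.search_card" href="/learn/python-basics">
  <h2 class="card-title">Python Basics</h2>
</a>
<a data-click-key="other.link" href="/about">About</a>
"""


class CardLinkParser(HTMLParser):
    """Collect the href of every anchor tagged as a search card."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Step 1: keep only <a> elements with the search-card click key.
        if tag == "a" and attr_map.get("data-click-key") == CARD_KEY:
            href = attr_map.get("href")
            if href:
                self.hrefs.append(href)


def extract_course_urls(html, base=BASE):
    """Return absolute URLs for every search-card link found in html."""
    parser = CardLinkParser()
    parser.feed(html)
    # Step 2: turn each href into a full URL; absolute hrefs pass through as-is.
    return [urljoin(base, href) for href in parser.hrefs]


course_urls = extract_course_urls(SAMPLE_HTML)
for url in course_urls:
    print(url)
```

The third link in the sample has a different data-click-key, so it is skipped. If Coursera changes that attribute value, the selector in both versions would need updating.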

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Just for fun