Scraping New YouTube Videos With BeautifulSoup

I'm new to Python and want to get into web scraping on YouTube. I want to use this link to get the newest uploaded videos: 'https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D', and I want to scrape the five newest videos. How can I do this? I tested it with the following code, taken from another question (I only want the links):

from bs4 import BeautifulSoup
import requests

url="https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
html = requests.get(url)
soup = BeautifulSoup(html.text, features="html.parser") 

for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])

Edit: I don't get any output in the Python terminal. Nothing is scraped; it only shows the default ">>>" prompt.



Solution 1:[1]

You can scrape YouTube by:

  • using a library that can render JavaScript, such as requests-HTML, Playwright, or Selenium.
  • using a regular expression on the raw page source.
  • using the YouTube Search Engine Results API from SerpApi.

Code using requests-HTML (very basic, just to give an idea):

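from requests_html import HTMLSession

session = HTMLSession()
url = "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
response = session.get(url)
# Render the page's JavaScript in a headless browser so the video links exist in the DOM.
response.html.render(sleep=1, keep_page=True, scrolldown=2)

# Each matching anchor is a video title link; absolute_links is a set of full URLs.
for element in response.html.find('a#video-title'):
    link = next(iter(element.absolute_links))
    print(link)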

Output:

https://www.youtube.com/watch?v=OUnxJk3Bphk
https://www.youtube.com/watch?v=vWvtt1ESNeY
https://www.youtube.com/watch?v=b8OIZu5y_Ak
https://www.youtube.com/watch?v=xp3fHaT2_VE
https://www.youtube.com/watch?v=e9toQAcjOrw
https://www.youtube.com/watch?v=em0Is0nyaXA
https://www.youtube.com/watch?v=N5JVTUAGmAM
https://www.youtube.com/watch?v=a0hQG-UdhYc
https://www.youtube.com/watch?v=SmQFxQ1fa2o
https://www.youtube.com/watch?v=uuMS1FYLgWQ
https://www.youtube.com/watch?v=8WJ-zSE32ZY
https://www.youtube.com/watch?v=c5MtH-xDspg
https://www.youtube.com/watch?v=5Xktqz6VUTU
https://www.youtube.com/watch?v=Wbo6j_iq2XY
https://www.youtube.com/watch?v=8eu9nliySO4
https://www.youtube.com/watch?v=j28PjOy_uk8
https://www.youtube.com/watch?v=fM2Ordt8Q9E
https://www.youtube.com/watch?v=tFSkaIVyNno
https://www.youtube.com/watch?v=1hDXlc2C3Rw
https://www.youtube.com/watch?v=vH9_Eo7VW3c
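
The question asks for only the five newest links, while the loop above prints every match. Here is a minimal sketch of trimming the list, assuming the results arrive newest-first (as the question's link intends). The helper name first_unique and the sample list are hypothetical, used only for illustration; in practice you would pass in the links collected by the loop.

```python
from itertools import islice


def first_unique(links, n=5):
    """Return the first n distinct links, preserving their original order."""
    # dict.fromkeys keeps insertion order and silently drops duplicates.
    unique = dict.fromkeys(links)
    return list(islice(unique, n))


# Hypothetical input: a few links from the output above, with one duplicate.
sample_links = [
    "https://www.youtube.com/watch?v=OUnxJk3Bphk",
    "https://www.youtube.com/watch?v=vWvtt1ESNeY",
    "https://www.youtube.com/watch?v=OUnxJk3Bphk",
    "https://www.youtube.com/watch?v=b8OIZu5y_Ak",
    "https://www.youtube.com/watch?v=xp3fHaT2_VE",
    "https://www.youtube.com/watch?v=e9toQAcjOrw",
    "https://www.youtube.com/watch?v=em0Is0nyaXA",
]

for link in first_unique(sample_links, 5):
    print(link)
```

Deduplicating first matters because the same video can be matched more than once on a page; slicing the raw list could otherwise return fewer than five distinct videos.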

Using regex without a headless browser.

The video data is embedded in the page source as a JavaScript variable, var ytInitialData. Locate that variable, then look under "commandMetadata", where each video's URL appears as {"url":"/watch?v=Ae2TRkpjRCc",....

Here's a starting point on regex101 that grabs everything inside var ytInitialData.
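
The regex idea can be sketched roughly as follows. This is an illustration under assumptions, not a tested scraper: it assumes the page source contains "var ytInitialData = {...};" immediately before a closing script tag, and that video URLs appear as "url":"/watch?v=<11-character id>". YouTube can change this layout at any time. The function name extract_watch_urls and the sample_html string are hypothetical; with a real page you would pass in requests.get(url).text instead.

```python
import re

# Assumption: the data blob is assigned to ytInitialData and ends with "};"
# right before the closing </script> tag.
YT_INITIAL_DATA_RE = re.compile(
    r"var ytInitialData\s*=\s*(\{.*?\});\s*</script>", re.DOTALL
)
# Assumption: video ids are 11 characters drawn from letters, digits, "_" and "-".
WATCH_URL_RE = re.compile(r'"url":"(/watch\?v=[\w-]{11})')


def extract_watch_urls(html, limit=5):
    """Return up to `limit` distinct absolute watch URLs found in ytInitialData."""
    match = YT_INITIAL_DATA_RE.search(html)
    if match is None:
        return []  # layout changed, or the variable is not on this page
    urls = []
    for path in WATCH_URL_RE.findall(match.group(1)):
        url = "https://www.youtube.com" + path
        if url not in urls:  # the same video can appear several times
            urls.append(url)
        if len(urls) == limit:
            break
    return urls


# Hypothetical, heavily shortened stand-in for a real results page.
sample_html = (
    '<script>var ytInitialData = {"a":{"commandMetadata":{"webCommandMetadata":'
    '{"url":"/watch?v=Ae2TRkpjRCc"}}},"b":{"url":"/watch?v=Ae2TRkpjRCc"},'
    '"c":{"url":"/watch?v=OUnxJk3Bphk"}};</script>'
)

print(extract_watch_urls(sample_html))
```

Matching the URLs directly with a second regex avoids having to walk the deeply nested JSON structure; if you need titles or other fields, parsing the captured blob with json.loads is likely the sturdier route.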


Alternatively, you can use the YouTube Search Engine Results API from SerpApi. It's a paid API with a free plan. Check out the Playground.

Code to integrate:

from serpapi import GoogleSearch

params = {
    "engine": "youtube",
    "search_query": "programming",
    "sp": "CAISBAgBEAE%253D",
    "api_key": "your_secret_api_key",  # replace with your own SerpApi key
}

search = GoogleSearch(params)
results = search.get_dict()

# Each entry in 'video_results' is a dict describing one video.
for result in results['video_results']:
    print(f"Title: {result['title']}\nLink: {result['link']}\n")

Output:

Title: CLASS VIII BASIC HTML TAGS AND PROGRAMMING 15 4 101`
Link: https://www.youtube.com/watch?v=KIPp63tXKpU

Title: For loop in c programming #bssdlectureclasses
Link: https://www.youtube.com/watch?v=nfRN0x9VvQc

Title: [C#] Programming NatsukiBot
Link: https://www.youtube.com/watch?v=chnigx-ezwg

Title: CS201 Short Lecture - 03 | VU Short Lecture | Introduction to Programming in (Urdu / Hindi)
Link: https://www.youtube.com/watch?v=qoxXJchd7N4

Title: Programming in C Language - While statement
Link: https://www.youtube.com/watch?v=cl0OpNCdF5I

Title: Introduction to html and Basic programming
Link: https://www.youtube.com/watch?v=A4We3NGqxuA

Title: Use of Printf & Scanf functions | Part 7 | C Programming | PadhoChalo
Link: https://www.youtube.com/watch?v=578xS-Ugc2c

Title: C++ course has started | Computer Programming | Aashu |
Link: https://www.youtube.com/watch?v=SjFgTK2HqbE

Title: Mitsubishi Outlander 2008 prox/twist transponder key programming tip
Link: https://www.youtube.com/watch?v=HlSJcBwxKFQ

Title: Computer Programming 1 -Introduction to the course
Link: https://www.youtube.com/watch?v=xdmPbhTT01g

Title: Programming, Data Structures and Algorithms in Python
Link: https://www.youtube.com/watch?v=0fUddu9cdAU

P.S. I wrote two blog posts, Scrape YouTube Search with Python (part 1) and Scrape YouTube Search with Python (part 2), that cover this in more depth with visual representation.

Disclaimer: I work for SerpApi.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1