Scraping New YouTube Videos With BeautifulSoup
I'm new to Python and want to get into web scraping on YouTube. I want to use this link to get the newest uploaded videos: 'https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D' and scrape the five newest ones. How can I do this? I tested it with the following code, taken from another question (I only want the links):
from bs4 import BeautifulSoup
import requests
url = "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
html = requests.get(url)
soup = BeautifulSoup(html.text, features="html.parser")
for entry in soup.find_all("entry"):
    for link in entry.find_all("link"):
        print(link["href"])
Edit: I don't get any output in the Python terminal. It's not scraping anything. It only shows the default ">>>" prompt.
Solution 1:[1]
You can scrape YouTube by:
- using the requests-HTML, playwright, or selenium libraries.
- using a regular expression.
- using the YouTube Search Engine Results API from SerpApi.
Code (it's really basic just to give an idea)
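from requests_html import HTMLSession
session = HTMLSession()
url = "https://www.youtube.com/results?search_query=programming&sp=CAISBAgBEAE%253D"
response = session.get(url)
# Render the page in a headless browser so the JavaScript-built results exist,
# scrolling down twice to load more of them.
response.html.render(sleep=1, keep_page=True, scrolldown=2)
# Each video title is an <a id="video-title"> element; print its absolute URL.
for links in response.html.find('a#video-title'):
    link = next(iter(links.absolute_links))
    print(link)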
Output:
https://www.youtube.com/watch?v=OUnxJk3Bphk
https://www.youtube.com/watch?v=vWvtt1ESNeY
https://www.youtube.com/watch?v=b8OIZu5y_Ak
https://www.youtube.com/watch?v=xp3fHaT2_VE
https://www.youtube.com/watch?v=e9toQAcjOrw
https://www.youtube.com/watch?v=em0Is0nyaXA
https://www.youtube.com/watch?v=N5JVTUAGmAM
https://www.youtube.com/watch?v=a0hQG-UdhYc
https://www.youtube.com/watch?v=SmQFxQ1fa2o
https://www.youtube.com/watch?v=uuMS1FYLgWQ
https://www.youtube.com/watch?v=8WJ-zSE32ZY
https://www.youtube.com/watch?v=c5MtH-xDspg
https://www.youtube.com/watch?v=5Xktqz6VUTU
https://www.youtube.com/watch?v=Wbo6j_iq2XY
https://www.youtube.com/watch?v=8eu9nliySO4
https://www.youtube.com/watch?v=j28PjOy_uk8
https://www.youtube.com/watch?v=fM2Ordt8Q9E
https://www.youtube.com/watch?v=tFSkaIVyNno
https://www.youtube.com/watch?v=1hDXlc2C3Rw
https://www.youtube.com/watch?v=vH9_Eo7VW3c
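The question asks for only the five newest links, while the loop above prints every match it finds. A minimal sketch of trimming the result, assuming the links have already been collected into any iterable of strings; `first_unique` is a hypothetical helper name, not part of requests-HTML:

```python
from itertools import islice


def first_unique(links, n=5):
    """Return the first `n` distinct links, preserving their original order."""
    # dict.fromkeys keeps insertion order and silently drops repeated links.
    return list(islice(dict.fromkeys(links), n))
```

With the code above, this could be called as `first_unique(link for el in response.html.find('a#video-title') for link in el.absolute_links)`.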
Using regex without a headless browser.
Find the var ytInitialData assignment in the page source, then look for "commandMetadata" inside it, where you'll find a URL for each video {"url":"/watch?v=Ae2TRkpjRCc",....
Here's a starting point on regex101 that grabs everything inside var ytInitialData.
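The regex route can be sketched roughly as follows. This is only an illustration under stated assumptions: it assumes the page embeds its data as `var ytInitialData = {...};` followed by a closing `</script>` tag, and that video links appear as `"url"` values starting with `/watch?v=`. The names `YT_INITIAL_DATA`, `iter_values`, and `extract_watch_urls` are made up for this example, and YouTube may change its markup at any time.

```python
import json
import re

# Assumption: the data blob looks like `var ytInitialData = {...};</script>`.
YT_INITIAL_DATA = re.compile(
    r"var ytInitialData\s*=\s*(\{.*?\})\s*;\s*</script>", re.DOTALL
)


def iter_values(node, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from iter_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values(item, key)


def extract_watch_urls(html, limit=5):
    """Return up to `limit` distinct absolute watch URLs found in ytInitialData."""
    match = YT_INITIAL_DATA.search(html)
    if match is None:
        return []  # blob not found, e.g. a different page was served
    data = json.loads(match.group(1))
    urls = []
    for url in iter_values(data, "url"):
        if isinstance(url, str) and url.startswith("/watch?v="):
            full = "https://www.youtube.com" + url
            if full not in urls:
                urls.append(full)
        if len(urls) >= limit:
            break
    return urls
```

It takes the raw page text, so it can be fed with `extract_watch_urls(requests.get(url).text)`. An empty list means the blob was not found, which can happen if a consent or error page comes back instead of the search results.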
Alternatively, you can use the YouTube Search Engine Results API from SerpApi. It's a paid API with a free plan. Check out the Playground.
Code to integrate:
from serpapi import GoogleSearch
params = {
    "engine": "youtube",
    "search_query": "programming",
    "sp": "CAISBAgBEAE%253D",
    "api_key": "your_secret_api_key"
}
search = GoogleSearch(params)
results = search.get_dict()
for link in results['video_results']:
    print(f"Title: {link['title']}\nLink: {link['link']}\n")
Output:
Title: CLASS VIII BASIC HTML TAGS AND PROGRAMMING 15 4 101`
Link: https://www.youtube.com/watch?v=KIPp63tXKpU
Title: For loop in c programming #bssdlectureclasses
Link: https://www.youtube.com/watch?v=nfRN0x9VvQc
Title: [C#] Programming NatsukiBot
Link: https://www.youtube.com/watch?v=chnigx-ezwg
Title: CS201 Short Lecture - 03 | VU Short Lecture | Introduction to Programming in (Urdu / Hindi)
Link: https://www.youtube.com/watch?v=qoxXJchd7N4
Title: Programming in C Language - While statement
Link: https://www.youtube.com/watch?v=cl0OpNCdF5I
Title: Introduction to html and Basic programming
Link: https://www.youtube.com/watch?v=A4We3NGqxuA
Title: Use of Printf & Scanf functions | Part 7 | C Programming | PadhoChalo
Link: https://www.youtube.com/watch?v=578xS-Ugc2c
Title: C++ course has started | Computer Programming | Aashu |
Link: https://www.youtube.com/watch?v=SjFgTK2HqbE
Title: Mitsubishi Outlander 2008 prox/twist transponder key programming tip
Link: https://www.youtube.com/watch?v=HlSJcBwxKFQ
Title: Computer Programming 1 -Introduction to the course
Link: https://www.youtube.com/watch?v=xdmPbhTT01g
Title: Programming, Data Structures and Algorithms in Python
Link: https://www.youtube.com/watch?v=0fUddu9cdAU
P.S. I wrote two blog posts, Scrape YouTube Search with Python (part 1) and Scrape YouTube Search with Python (part 2), that cover this in more depth with visual examples.
Disclaimer: I work for SerpApi.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
