BeautifulSoup only outputs data sometimes?

I'm scraping the links to all the posts on this subreddit (specifically, the top posts from the last 24 hours). When I run my program, it sometimes outputs all the data and other times outputs nothing, with the exact same code. It works about one time in five.

import re

import requests
from bs4 import BeautifulSoup

# URL of subreddit
test = requests.get('https://www.reddit.com/r/TikTokCringe/top/')
# the HTML of the response
html = test.text
# making a soup of the HTML
soup = BeautifulSoup(html, 'html.parser')
# find_all returns the <a> elements whose href contains '/r/TikTokCringe/comments/'; the slice keeps the first 30
for href in soup.find_all('a', {"href": re.compile('/r/TikTokCringe/comments/')})[:30]:
    # I'm looping through every element because I eventually want to get just the links
    # for now I'm just trying to print every element
    print(href)


Solution 1:[1]

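You're getting HTTP error 429 (Too Many Requests). When that happens, the response body is an error page with no post links in it, so the loop prints nothing. Try slowing down your requests or setting the User-Agent HTTP header: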

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"
}

# URL of subreddit
test = requests.get("https://reddit.com/r/TikTokCringe/top/", headers=headers)

...
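
To illustrate the "slow down" part, here is a minimal sketch that retries with exponential backoff whenever the status code is 429. The function name fetch_with_backoff, its parameters, and the default delays are assumptions made for this example and are not part of the original answer. The get argument is any callable shaped like requests.get.

```python
import time


def fetch_with_backoff(get, url, headers=None, max_attempts=4,
                       base_delay=2.0, sleep=time.sleep):
    """Call get(url, headers=headers), retrying on HTTP 429.

    Hypothetical helper: `get` is expected to behave like requests.get,
    returning an object with `status_code` and `headers` attributes.
    """
    for attempt in range(max_attempts):
        response = get(url, headers=headers)
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Use Retry-After when it is a plain number of seconds;
            # otherwise double the wait on every attempt.
            retry_after = response.headers.get("Retry-After", "")
            if retry_after.isdigit():
                delay = float(retry_after)
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
    # Still rate-limited after max_attempts; return the last response
    # so the caller can inspect it.
    return response
```

With the headers dictionary from above, it could be called as fetch_with_backoff(requests.get, "https://reddit.com/r/TikTokCringe/top/", headers=headers). Checking status_code on the result before parsing makes a rate-limited response visible instead of silently producing no output.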

Also consider using Reddit's JSON format (add .json to the end of the URL):

data = requests.get(
    "https://reddit.com/r/TikTokCringe/top/.json", headers=headers
).json()

print(data)
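
Since the goal in the question is to get just the links, here is a hedged sketch of pulling them out of that JSON. It assumes the response is a Reddit listing in which each post sits under data -> children -> data with a permalink field. The function name extract_post_links and the limit parameter are made up for this example.

```python
def extract_post_links(listing, limit=30):
    """Return up to `limit` full post URLs from a Reddit listing dict.

    Assumes the layout listing["data"]["children"][i]["data"]["permalink"];
    entries without a permalink are skipped.
    """
    children = listing.get("data", {}).get("children", [])
    links = []
    for child in children[:limit]:
        permalink = child.get("data", {}).get("permalink")
        if permalink:
            # Permalinks are site-relative paths, so prepend the host.
            links.append("https://www.reddit.com" + permalink)
    return links
```

Calling extract_post_links(data) on the parsed response would then give a plain list of URLs, with no HTML parsing or regular expressions involved.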

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Andrej Kesely