Why won't BeautifulSoup extract all the HTML from a public Twitter page?
I am trying to write some code to extract tweets from a public Twitter page (Nike store) using the Python BS4 module. When I print the page HTML to the console, only some of the HTML is printed: when I search (Ctrl+F) the console output for the specific class values of a tag, I get zero results. Why is this happening?
Here is a code snippet:
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen
import re

if __name__ == '__main__':
    # Read the webpage into page_html and close the connection to the webpage
    first_page = 'https://twitter.com/nikestore'
    url_client = urlopen(first_page)
    page_html = url_client.read()
    url_client.close()
    print(page_html)
Solution 1:[1]
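I came across the accepted answer in the following link. The most likely cause is that urlopen() only returns the HTML the server sends initially, while the tweets are added to the page afterwards by JavaScript running in the browser, so their class values never appear in the printed output. The answer also suggests using Selenium to circumvent the problem.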
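To illustrate why the class search can come back empty, and how the Selenium suggestion gets around it, here is a minimal sketch. The counting part uses only the standard library's html.parser so it runs without extra installs. The two HTML strings, the class name tweet-text, and the helper names count_class and fetch_rendered_html are hypothetical and not taken from Twitter's real markup. fetch_rendered_html assumes Selenium and Chrome are installed; it is defined here but not called.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    """Return how many elements in html carry class_name."""
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


def fetch_rendered_html(url, css_selector, timeout=10):
    """Load url in a real browser so its scripts run, then return the DOM as HTML.

    Assumption: Selenium and Chrome are installed. css_selector is a
    hypothetical selector for an element that only exists after rendering.
    """
    # Imported here so the rest of the sketch works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Poll until at least one matching element has been added to the page.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_elements(By.CSS_SELECTOR, css_selector)
        )
        return driver.page_source
    finally:
        driver.quit()


# Hypothetical page as urlopen() would receive it: the tweet markup
# exists only as a string inside a script, not as a real element.
static_html = """
<html><body>
<div id="timeline"></div>
<script>
document.getElementById("timeline").innerHTML = '<p class="tweet-text">Hello</p>';
</script>
</body></html>
"""

# The same hypothetical page after a browser has run the script.
rendered_html = """
<html><body>
<div id="timeline"><p class="tweet-text">Hello</p></div>
</body></html>
"""

static_count = count_class(static_html, "tweet-text")
rendered_count = count_class(rendered_html, "tweet-text")
print(static_count, rendered_count)
```

With Selenium available, the string returned by fetch_rendered_html could be passed to soup(page_html, 'html.parser') in place of the urlopen() result. The selector to wait for has to be found by inspecting the live page and is left as an assumption here.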
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
