Python parsing the site gives <html></html>

There is a website that I need to analyze. However, when I try to fetch it, I get the response <html></html>.

I tried changing the User-Agent and the cookies, but it doesn't help.

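from bs4 import BeautifulSoup
import httpx

# Plain HTTP request: no browser is involved and no JavaScript is executed
response = httpx.get('https://lolz.guru/market/')
soup = BeautifulSoup(response.text, 'lxml')

# Prints <html></html> instead of the page content
print(response.text)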


Solution 1:[1]

If the site only serves its content to a real browser, you can drive a real browser to load the page and read the data from it. Selenium is a tool intended for testing web applications, but in essence it runs scripts that imitate a user interacting with a web browser, so it can also be used to retrieve pages.

There are good tutorials available, including ones on using Selenium from Python.

It also supports cookies: https://www.selenium.dev/documentation/webdriver/browser/cookies/

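from selenium import webdriver

# Starts a real Chrome instance controlled by Selenium
driver = webdriver.Chrome()

# Load a page first: a cookie can only be set for the domain that is currently open
driver.get("http://www.example.com")

# Adds the cookie into current browser context
driver.add_cookie({"name": "key", "value": "value"})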
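
To tie this back to the question, here is a minimal sketch of reading the rendered HTML out of the browser through `driver.page_source`. The helper names `fetch_rendered_html` and `has_content` are hypothetical and only for illustration. It assumes Selenium 4 and a local Chrome installation, and it has not been tested against the site from the question, which may still need an explicit wait or block automated browsers.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects non-empty text nodes from an HTML document."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def has_content(html: str) -> bool:
    """Return True if the markup contains any non-whitespace text node.

    Hypothetical helper: a quick way to tell an empty shell such as
    <html></html> apart from a page that actually has content.
    """
    collector = _TextCollector()
    collector.feed(html)
    return bool(collector.chunks)


def fetch_rendered_html(url: str) -> str:
    """Open `url` in Chrome via Selenium and return the current DOM as HTML.

    Hypothetical helper. Assumes Selenium 4 and a local Chrome install.
    """
    # Imported lazily so has_content() stays usable without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # page_source reflects the DOM at the moment of the call; content that
        # loads later may need an explicit wait before reading it.
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()
```

The string returned by `fetch_rendered_html('https://lolz.guru/market/')` can be passed to `BeautifulSoup(html, 'lxml')` exactly as in the question. If `has_content(html)` is still false, the page is probably not finished loading or the site is rejecting the automated browser.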

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
