'Python: Access sessionStorage using requests
I need to have access to sessionStorage (as with javascript) object using python requests module, is there a way to acomplish my goal; I have seen other answers and none of them seem to have adequate responses for the task I want to acomplish
If there's no way, what other alternatives do I have besides of working with selenium (since there's a manner to do so)?
IN SIMPLE TERMS
I want to do this:
var x = sessionStorage; // js code
But in python 3.9 :)
Solution 1:[1]
IIUC: The following code was written to extract the sessionStorage property values from a web page into a Python dict.
import re
import json
from bs4 import BeautifulSoup as bs
import requests
# Setup.
site = 'http://www.some-site.com/page'
exp = '^[\n\s]+sessionStorage.setItem\(.*JSON.stringify\((?P<content>{.*})\)\)'
r = requests.get(site)
if r.status_code == 200:
soup = bs(r.text)
# Extract all <script> tags from the full HTML.
scripts = soup.findAll('script')
# Loop through all <script> tags until sessionStorage is found.
script = [s.string for s in scripts if 'sessionStorage' in s.decode()]
# Use regex (with a named capture group) to extract the JSON data.
m = re.match(exp, script[0])
if m:
content = m['content']
# Convert scraped JSON data to a dict.
data = json.loads(content)
Note: The regex pattern may need to be modified to suit your (a user's) specific use-case.
TL;DR (Background):
I ran across this question in my own search for a more elegant solution to the above code.
In my case, I was writing unit tests for a site and needed to grab the sessionStorage property from a specific web page to test that it contains the expected elements. As the data was in JSON format, this code extracts the JSON data and converts to a Python dict for inspection.
Solution 2:[2]
If you're using ajax to get information from the server, you should include your sessionStorage data first and then attach it to the request like
myStorage = window.sessionStorage;
let token = myStorage.getItem('token')
$.get(url, { token: token }, function(response)() }
Or, if you're not using Ajax and you switch pages by refreshing your page everytime you can use cookies vs session storage which then the server can extract
# Python
import os, sys
sys.stderr.write("cookies:", os.getenv('HTTP_COOKIE'))
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Ricky Levi |
