Scraping a dynamic JS website with Python to Excel
I am hoping someone can help me out. I have been trying for weeks and cannot figure out how to scrape archived data from www.pregame.com/game-center.
For instance, I would like to scrape every date of an entire NBA season. One example is 2/8/2022, which has this URL: https://pregame.com/game-center/?d=1644300000000&t=2&l=3&a=1&s=StartTimeDate&m=false&b=undefined&o=Current&c=All&k=
Any advice or guidance would be appreciated. Thank you!
Solution 1:[1]
I don't see any NBA games coming up for Feb 8.
Anyway, you can get the data through the API by putting the date in the URL (or passing it as a payload parameter). You'll have to do some data merging and cleanup, but here is a quick example:
import requests
import pandas as pd

# The date goes into the query string as month-day-year.
url = 'https://pregame.com/api/gamecenter/init?dt=1-30-2022'
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/98.0.4758.102 Safari/537.36'}
jsonData = requests.get(url, headers=headers).json()

# Each key holds one table; merge them all on EventId.
tables = ['Consensus', 'Events', 'Odds', 'Scores']
for idx, k in enumerate(tables):
    table = jsonData['GameCenterData'][k]
    temp = pd.json_normalize(table)
    if idx == 0:
        results = temp
    else:
        # If a table has no 'EventId' column, use its 'Id' column as the key.
        if 'EventId' not in temp.columns:
            temp = temp.rename(columns={'Id': 'EventId'})
        results = results.merge(temp, how='outer', on=['EventId'])

# Keep only the NBA rows and save them.
resultsNBA = results[results['LeagueName'] == 'NBA']
resultsNBA.to_csv('nba_2022_01_30.csv', index=False)
Output:
print(resultsNBA)
AllCash AllCashRanking ... AwayStatus HomeStatus
522 213380.23 0.0 ... Final
523 213380.23 0.0 ... Final
524 213380.23 0.0 ... Final
525 213380.23 0.0 ... Final
526 213380.23 0.0 ... Final
... ... ... ... ...
9943 88802.61 0.0 ... Final
9944 88802.61 0.0 ... Final
9945 88802.61 0.0 ... Final
9946 88802.61 0.0 ... Final
9947 88802.61 0.0 ... Final
[2898 rows x 108 columns]
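To cover a whole season rather than one day, one possible approach is to generate one API URL per date. The sketch below is only that: it assumes the `d` value in the page URL from the question is a Unix timestamp in milliseconds (1644300000000 works out to 2022-02-08 06:00 UTC), and that the API wants `dt` as month-day-year without zero padding, as in `1-30-2022`. The helper names `epoch_ms_to_date`, `format_dt` and `season_urls` are made up for illustration and are not part of any library or of the site's API.

```python
from datetime import date, datetime, timedelta, timezone

# Endpoint copied from the example above; the dt format is an assumption.
API_URL = "https://pregame.com/api/gamecenter/init?dt={}"


def epoch_ms_to_date(ms):
    """Read a millisecond Unix timestamp (assumed meaning of `d`) as a UTC date."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc).date()


def format_dt(day):
    """Format a date as M-D-YYYY without zero padding, e.g. 1-30-2022."""
    return f"{day.month}-{day.day}-{day.year}"


def season_urls(start, end):
    """Return one API URL per calendar day from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [API_URL.format(format_dt(start + timedelta(days=i)))
            for i in range(n_days)]


# The `d` value from the question's URL.
print(epoch_ms_to_date(1644300000000))

# A short range for demonstration; use the real season dates instead.
urls = season_urls(date(2022, 1, 30), date(2022, 2, 1))
for u in urls:
    print(u)
```

Each URL could then go through the same request and merge steps as the single-day example, with the daily frames combined using `pd.concat`. Pausing between requests is probably a good idea, and days without games may need to be skipped.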
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | chitown88 |