Failed to scrape tabular content from a webpage using requests module

I'm trying to scrape tabular content from a webpage using the requests module. The content of that page is heavily dynamic. However, according to dev tools, it can be accessed via an API. I'm trying to mimic that by issuing a POST request with the appropriate parameters, but I always get status 403.

import requests
from pprint import pprint

start_url = 'https://opensea.io/rankings'
link = 'https://api.opensea.io/graphql/'
payload = {"id":"rankingsQuery","query":"query rankingsQuery(\n  $chain: [ChainScalar!]\n  $count: Int!\n  $cursor: String\n  $sortBy: CollectionSort\n  $parents: [CollectionSlug!]\n  $createdAfter: DateTime\n) {\n  ...rankings_collections\n}\n\nfragment rankings_collections on Query {\n  collections(after: $cursor, chains: $chain, first: $count, sortBy: $sortBy, parents: $parents, createdAfter: $createdAfter, sortAscending: false, includeHidden: true, excludeZeroVolume: true) {\n    edges {\n      node {\n        createdDate\n        name\n        slug\n        logo\n        stats {\n          floorPrice\n          marketCap\n          numOwners\n          totalSupply\n          sevenDayChange\n          sevenDayVolume\n          oneDayChange\n          oneDayVolume\n          thirtyDayChange\n          thirtyDayVolume\n          totalVolume\n          id\n        }\n        id\n        __typename\n      }\n      cursor\n    }\n    pageInfo {\n      endCursor\n      hasNextPage\n    }\n  }\n}\n","variables":{"chain":None,"count":100,"cursor":"YXJyYXljb25uZWN0aW9uOjk5","sortBy":"SEVEN_DAY_VOLUME","parents":None,"createdAfter":None}}

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36'
    s.headers['x-api-key'] = '2f6f419a083c46de9d83ce3dbe7db601'
    s.headers['x-build-id'] = 'cplNDIqD8Uy8MvANX90r9'
    s.headers['referer'] = 'https://opensea.io/'
    res = s.post(link,json=payload)
    pprint(res.status_code)
    print(res.json())

How can I scrape the tabular content from that webpage using the requests module?



Solution 1:[1]

I don't think that GraphQL query is the one you want. The page also issues a GET request that returns the data.

Try this instead, inside the same session:

res = s.get('https://api.opensea.io/tokens/?limit=100')
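
For a slightly fuller picture, here is a minimal sketch of that GET call wrapped in a helper that checks the status code and content type before decoding, so a 403 or an HTML error page does not blow up in res.json(). The endpoint and the limit parameter are the ones from the line above; the names fetch_json and main, the shortened User-Agent, and the 10-second timeout are my own assumptions. I make no claim about the shape of the JSON that comes back, or that the endpoint will still answer without further headers.

```python
from pprint import pprint

import requests

# Endpoint taken from the answer above; it may change or start requiring auth.
API_URL = 'https://api.opensea.io/tokens/'


def fetch_json(session, url, **params):
    """GET `url` with `params`; return the decoded JSON, or None if the
    server refuses the request or answers with something that is not JSON."""
    res = session.get(url, params=params, timeout=10)  # timeout is an assumption
    if res.status_code != 200:
        print(f'Request failed with status {res.status_code}')
        return None
    if 'json' not in res.headers.get('Content-Type', ''):
        print('Response is not JSON')
        return None
    return res.json()


def main():
    # Performs a real network request when called.
    with requests.Session() as s:
        s.headers['User-Agent'] = 'Mozilla/5.0'  # shortened, hypothetical value
        data = fetch_json(s, API_URL, limit=100)
        if data is not None:
            pprint(data)
```

Call main() to run it. If it prints a 403 again, compare the request headers dev tools shows for that GET call with the ones the session sends.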

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Rusticus