Cannot find the table data within the soup, but I know it's there

I am trying to create a function that scrapes college baseball team roster pages for a project. I have written a function that crawls the roster page and collects a list of the links I want to scrape. But when I try to scrape the individual link for each player, the request works, yet the data that is on the player's page cannot be found.

This is the link to the page I am crawling from at the start:

https://gvsulakers.com/sports/baseball/roster

These are just functions that I call within the function that I am having a problem with:

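import requests
from bs4 import BeautifulSoup

# Request headers; the real value is not shown in the question (placeholder)
headers = {"User-Agent": "Mozilla/5.0"}

def parse_row(rows):
    # Return the string of every <td> cell in one table row
    return [str(x.string) for x in rows.find_all('td')]

def scrape(url):
    # Download a page and parse it into a BeautifulSoup object
    page = requests.get(url, headers=headers)
    html = page.text
    soop = BeautifulSoup(html, 'lxml')
    return soop

def find_data(url):
    # Parse every <tr> on the page into a list of cell strings
    soop = scrape(url)
    row = soop.find_all('tr')
    lopr = [parse_row(rows) for rows in row]
    return lopr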
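
A note on find_data: it builds one entry per <tr>, so it can only return an empty list when the fetched HTML contains no <tr> elements at all. The sketch below is one possible way to check what a response actually contains. The names count_tags and inspect_page are hypothetical helpers, not part of the original code, and the sample HTML is hand-written rather than taken from the real site.

```python
import requests
from bs4 import BeautifulSoup

def count_tags(html, names=("table", "tr", "td", "script")):
    """Count how often each tag name occurs in an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    return {name: len(soup.find_all(name)) for name in names}

def inspect_page(url, headers=None):
    """Fetch a URL and report the status code, final URL, and tag counts."""
    page = requests.get(url, headers=headers, timeout=10)
    return {
        "status": page.status_code,
        "final_url": page.url,  # differs from url if the request was redirected
        "tags": count_tags(page.text),
    }

# Offline check on a hand-written snippet (an assumption, not the real page)
sample = "<table><tr><td>1</td><td>P</td></tr></table>"
print(count_tags(sample))
```

If inspect_page reports zero <tr> tags for a player URL, the status code and final URL are worth checking first: an error status or an unexpected redirect would suggest the URL was built incorrectly, while a 200 response without rows may mean the data is not served as table markup in the raw HTML (for example, if it is filled in by JavaScript).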

Here is what I am having an issue with. When I assign the result of type1_roster to a variable and print it, I only get an empty list. Ideally it should contain data about a player or players from a player's roster page.

# Roster page crawler
def type1_roster(team_id):
  url = "https://" + team_id + ".com/sports/baseball/roster"
  soop = scrape(url)
  href_tags = soop.find_all(href = True)
  hrefs = [tag.get('href') for tag in href_tags]
  # get all player links
  player_hrefs = []
  for href in hrefs:
    if 'sports/baseball/roster' in href:
      if 'sports/baseball/roster/coaches' not in href:
        if 'https:' not in href:
          player_hrefs.append(href)
  # get rid of duplicates
  player_links = list(set(player_hrefs))
  # scrape the roster links
  for link in player_links:
    player_ = url + link[24:]
    return(find_data(player_))
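
Two details in type1_roster may be worth checking. The return sits inside the for loop, so the function exits after the first link. Also, if the hrefs have the form /sports/baseball/roster/<name>/<id> (an assumption, not confirmed against the page), then link[24:] drops the slash after "roster" and the joined URL is malformed. The sketch below uses urllib.parse.urljoin instead of slicing; player_urls and the example href are hypothetical and only for illustration.

```python
from urllib.parse import urljoin

def player_urls(roster_url, hrefs):
    """Turn relative roster hrefs into absolute URLs, keeping order, no duplicates."""
    urls = []
    for href in hrefs:
        # Same filters as in type1_roster
        if "sports/baseball/roster" not in href:
            continue
        if "sports/baseball/roster/coaches" in href or "https:" in href:
            continue
        full = urljoin(roster_url, href)
        if full not in urls:
            urls.append(full)
    return urls

roster_url = "https://gvsulakers.com/sports/baseball/roster"
# Made-up href for illustration; the real hrefs on the page may differ
href = "/sports/baseball/roster/example-player/1234"

sliced = roster_url + href[24:]     # string slicing, as in the question
joined = urljoin(roster_url, href)  # resolves the href against the roster URL
print(sliced)
print(joined)
```

With URLs built this way, the results of find_data could be collected in a list inside the loop and returned once after it finishes, rather than returning on the first iteration.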

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
