Iterate scraping through URLs

I have this code that I'm trying to run, but I get an error about an invalid schema:

# for index, row in df.iterrows():
#     print(index, row["Data"])
for offset in df.apply(lambda row: row["Data"], axis=1):
    response = requests.get(df["Data"])
    print('url:', response.url)
    

This is my dataframe: it holds a group of links per page (10 per page) across two index rows, so 20 links in total:

Data
0    [http://www.mercadopublico.cl/Procurement/Modu...
1    [http://www.mercadopublico.cl/Procurement/Modu...

I want this code to run over each batch of 10 links, scrape them, and collect the data, then move on to the next batch, with all the scraped data ending up in one table.

But I can't make the request use the URLs inside the dataframe.

I get this message:

InvalidSchema: No connection adapters were found for '0    [http://www.mercadopublico.cl/Procurement/Modu...\n1    [http://www.mercadopublico.cl/Procurement/Modu...\nName: Data, dtype: object'

Do you have any advice on this? Best regards.

I think it would also help to fuse both index rows into one, but I'm not sure how to do it. I searched a lot but couldn't find how; I found some references to np.array that I tried, but they didn't work.



Solution 1:[1]

Just to answer, because I solved it: never store URLs in a DataFrame if you are going to scrape them later. Instead of making a DataFrame resultsurl, store them as a list: resultsurl = list().

Then iterate over that list, as in for i in resultsurl: (in this case the list is called resulturl).
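A minimal sketch of this approach, assuming the scraped per-page URL lists look like the two rows in the question (the example.com URLs and the names pages, resulturl, and fetch_all are hypothetical stand-ins, not from the original code):

```python
import requests
from itertools import chain

# Hypothetical data: each inner list is one page's worth of scraped URLs,
# mirroring the two index rows of the original DataFrame's "Data" column.
pages = [
    ["http://example.com/a", "http://example.com/b"],
    ["http://example.com/c", "http://example.com/d"],
]

# Fuse the per-page lists into one flat list of URL strings,
# instead of storing them in a DataFrame.
resulturl = list(chain.from_iterable(pages))

def fetch_all(urls):
    """Request each URL string one at a time -- requests.get needs a
    single 'http://...' string, not a whole pandas Series (passing the
    Series is what triggers the InvalidSchema error)."""
    responses = []
    for url in urls:
        response = requests.get(url)
        print('url:', response.url)
        responses.append(response)
    return responses

# fetch_all(resulturl)  # uncomment to actually perform the requests
```

With the URLs in a plain list, each loop iteration hands requests.get one string, which is what its URL parameter expects.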

thanks

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 kcomarks