How to get the base string and the page-number string in a for loop?

Currently I am putting the full URL in urlList. I want only the string after page_no in urlList, and the rest of the program should carry on as it is.

https://bidplus.gem.gov.in/bidlists?bidlists&page_no=**AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI**

urlList = ["https://bidplus.gem.gov.in/bidlists?bidlists&page_no=AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI",
           "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=Hgw0LYpSZdLXow1Wq84uKar1nxXbFhClXQDuAAiPDxU",
           "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=rO5Erb90Q_P1S0fL5O6FEShlv20RBXmkHFusZogvUoo",
           "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=jiE0kS8e-ghmlmjDMPUJm1OBCRotqJ6n7srXZN99LZc",
           "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=MY89EG2RtzpSMlT1wjE61Cv31nAyetQ49kmXfw2AfMo",
           ]
import requests
from bs4 import BeautifulSoup as bs

for url in urlList:
    print('Hold on creating URL to fetch data...')
    url = 'https://bidplus.gem.gov.in/bidlists?bidlists&page_no=' + str(page_no)
    print('URL created: ' + url)
    scraped_data = requests.get(url, verify=False)
    soup_data = bs(scraped_data.text, 'lxml')
    extracted_data = soup_data.find('div', {'id': 'pagi_content'})


Solution 1:[1]

Use this line right after your urlList assignment:

urlList = [x.split('=')[-1] for x in urlList]
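Putting it together with the loop from the question, a minimal sketch of the whole flow (the base_url constant and the page_no loop-variable name are assumptions for illustration, not from the original post):

```python
base_url = 'https://bidplus.gem.gov.in/bidlists?bidlists&page_no='

urlList = [
    "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI",
    "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=Hgw0LYpSZdLXow1Wq84uKar1nxXbFhClXQDuAAiPDxU",
]

# Strip each URL down to the token after the final '='
urlList = [x.split('=')[-1] for x in urlList]

for page_no in urlList:
    # Rebuild the full URL from the base string plus the page token
    url = base_url + page_no
    print('URL created: ' + url)
```

The rest of the loop (the requests.get call and the BeautifulSoup parsing) can stay exactly as in the question.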

Solution 2:[2]

You can split each URL on = and take the last part:

for url in urlList:
    print(url.split("=")[-1])

outputs:
AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI
Hgw0LYpSZdLXow1Wq84uKar1nxXbFhClXQDuAAiPDxU
rO5Erb90Q_P1S0fL5O6FEShlv20RBXmkHFusZogvUoo
jiE0kS8e-ghmlmjDMPUJm1OBCRotqJ6n7srXZN99LZc
MY89EG2RtzpSMlT1wjE61Cv31nAyetQ49kmXfw2AfMo

If you want the page numbers in their own list:

pagenumbers = [url.split("=")[-1] for url in urlList]
>>> pagenumbers
['AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI', 'Hgw0LYpSZdLXow1Wq84uKar1nxXbFhClXQDuAAiPDxU', 'rO5Erb90Q_P1S0fL5O6FEShlv20RBXmkHFusZogvUoo', 'jiE0kS8e-ghmlmjDMPUJm1OBCRotqJ6n7srXZN99LZc', 'MY89EG2RtzpSMlT1wjE61Cv31nAyetQ49kmXfw2AfMo']
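Splitting on "=" works here because the token is the last value in the query string. If the URL ever gains more parameters, a more robust option (a sketch using the standard library, not part of the original answers) is to parse the query string explicitly:

```python
from urllib.parse import urlparse, parse_qs

url = "https://bidplus.gem.gov.in/bidlists?bidlists&page_no=AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI"

# parse_qs maps each query parameter to a list of its values;
# the bare 'bidlists' flag has no value, so it is skipped by default
params = parse_qs(urlparse(url).query)
page_no = params['page_no'][0]
print(page_no)  # AMCR24yMNFkfoXF3wKPmGMy_wV8TJPAlxm6oWiTHGOI
```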

Alternatively, there is no need to split the URLs at all: in your for loop you can use url directly, since you are already iterating over the full URLs.

for url in urlList:
    print('Hold on fetching data...')
    scraped_data = requests.get(url, verify=False)
    soup_data = bs(scraped_data.text, 'lxml')

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 gajendragarg
Solution 2