Is there any way, other than status_code, to determine whether a request succeeded?

I'm using Python3 with BeautifulSoup. I want to scrape data for a few employees from a site, depending on their ID number.

My code:

for UID in range(201810000,201810020):
    ID = UID
    print(ID)
    # scraped data
    ZeroDay = s.post("https://site/Add_StudantRow.php",data={"SID":ID})
    ZeroDay_content = bs(ZeroDay.content,"html.parser", from_encoding='windows-1256')
    std_ID    = ZeroDay_content.find("input", {"name":"SID[]"})["value"]
    std_name  = ZeroDay_content.find("input", {"name":"Name[]"})["value"]
    std_major_= ZeroDay_content.select_one("option[selected]", {"name":"Qualifications[]"})["value"]
    std_major = ZeroDay_content.find("input", {"name":"Specialization[]"})["value"]
    std_social= ZeroDay_content.select_one("select[name='MILITARY_STATUS[]'] option[selected]")["value"]
    std_ID_num= ZeroDay_content.find("input", {"name":"ID_Number[]"})["value"]
    std_gender= ZeroDay_content.select_one("select[name='Gender[]'] option[selected]")["value"]

print(std_ID,std_name,std_gender,std_major,std_major_,std_ID_num,std_social)

After I ran my code, this error appeared:

    std_ID    = ZeroDay_content.find("input", {"name":"SID[]"})["value"]
TypeError: 'NoneType' object is not subscriptable

I assigned a range of IDs from 201810000 to 201810020, but not all of the IDs are valid. For example, 201810015 might be invalid while 201810018 is valid.

Note: when I put a single valid ID in UID, the error does not appear, so it seems to be raised when an ID returns no data. How can I loop over a range of IDs in this case?



Solution 1:[1]

As not all of your UID values return a valid page, you would just need to first test for the presence of a required tag. As you are looking for form elements, I assume there will be an enclosing <form> tag you could test for first.

For example:

for UID in range(201810000, 201810020):
    ID = UID
    print(ID)
    
    ZeroDay = s.post("https://site/Add_StudantRow.php", data={"SID":ID})
    ZeroDay_content = bs(ZeroDay.content, "html.parser", from_encoding='windows-1256')
    
    if ZeroDay_content.find("form", <xxxxxxx>):
        std_ID    = ZeroDay_content.find("input", {"name":"SID[]"})["value"]
        std_name  = ZeroDay_content.find("input", {"name":"Name[]"})["value"]
        # Assumes Qualifications[] is a <select>; select_one() takes no attribute dict
        std_major_= ZeroDay_content.select_one("select[name='Qualifications[]'] option[selected]")["value"]
        std_major = ZeroDay_content.find("input", {"name":"Specialization[]"})["value"]
        std_social= ZeroDay_content.select_one("select[name='MILITARY_STATUS[]'] option[selected]")["value"]
        std_ID_num= ZeroDay_content.find("input", {"name":"ID_Number[]"})["value"]
        std_gender= ZeroDay_content.select_one("select[name='Gender[]'] option[selected]")["value"]
        
        print(std_ID, std_name, std_gender, std_major, std_major_, std_ID_num, std_social)

Here <xxxxxxx> stands for whatever attributes identify the form on a valid page.

The error occurs because your first .find() call returns None when the item is not present on the page. You then apply ["value"] to that None without first testing whether the item was found, which raises the TypeError.
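The same None check can also be applied per field rather than on an enclosing form. The following is only a minimal sketch: the helper names input_value and scrape_student are hypothetical, and it assumes that a missing SID[] input is what marks an invalid ID, which the thread suggests but does not confirm. The soup argument is the parsed page (ZeroDay_content in the code above).

```python
def input_value(soup, field_name):
    """Return the value attribute of <input name=field_name>, or None if absent."""
    tag = soup.find("input", {"name": field_name})
    if tag is None:
        # find() returns None when nothing matches, so stop before indexing
        return None
    return tag.get("value")


def scrape_student(soup):
    """Return a dict of fields, or None when the page has no SID[] input.

    Assumption: an invalid ID produces a page without the SID[] input.
    """
    std_id = input_value(soup, "SID[]")
    if std_id is None:
        return None
    return {
        "id": std_id,
        "name": input_value(soup, "Name[]"),
        "major": input_value(soup, "Specialization[]"),
        "id_number": input_value(soup, "ID_Number[]"),
    }
```

Inside the loop, you would call record = scrape_student(ZeroDay_content) and use continue when record is None, so invalid IDs are skipped instead of raising.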

Solution 2:[2]

I resolved this by adding an if statement that uses the content length to decide whether the request returned any data. I noticed that the content length is less than 170 when the request returns nothing and more than 170 when it returns a record.
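A rough sketch of that check might look like the following. The function name looks_valid and the constant name are hypothetical, and the sketch assumes the length is measured on the response body in bytes (for example ZeroDay.content) rather than read from the Content-Length header. The threshold of 170 comes from the observation above and is specific to this site.

```python
# Threshold observed for this particular site; other sites will differ.
EMPTY_PAGE_MAX_BYTES = 170


def looks_valid(body: bytes, threshold: int = EMPTY_PAGE_MAX_BYTES) -> bool:
    """Return True when the body is longer than the 'empty page' size."""
    return len(body) > threshold
```

In the loop this would be used as: if not looks_valid(ZeroDay.content): continue. A size threshold is fragile if the site changes its markup, so testing for a required tag is usually the more robust option.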

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Martin Evans
Solution 2 xxxzman