How do I use soup.find and soup.find_all?

Here is my code and the output

import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.jobberman.com/jobs")
soup = BeautifulSoup(res.text, "html.parser")
job = soup.find("div", class_ = "relative inline-flex flex-col w-full text-sm font-normal pt-2")
company_name = job.find('a[href*="jobs"]')
print(company_name)

The output is None:

None

But when I use the select method, I get the desired result but can't use .text on it:

import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.jobberman.com/jobs")
soup = BeautifulSoup(res.text, "html.parser")
job = soup.find("div", class_ = "relative inline-flex flex-col w-full text-sm font-normal pt-2")
company_name = job.select('a[href*="jobs"]').text
print(company_name)

Output:

AttributeError: ResultSet object has no attribute 'text'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?


Solution 1:[1]

Change your selection strategy. The main issue here is that not all company names are linked, so select the element that holds the company name instead of relying on a link:

job.find('div',{'class':'search-result__job-meta'}).text.strip()

or

job.select_one('.search-result__job-meta').text.strip()
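
To see why the original find() call returned None and why select() rejected .text, here is a minimal sketch on made-up markup. The HTML, the class name "card" and the names in it are assumptions for illustration only, not the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page differs.
html = """
<div class="card">
  <a href="/jobs/1">Restaurant Manager</a>
  <a href="/jobs?company=acme">Acme Ltd</a>
</div>
"""
card = BeautifulSoup(html, "html.parser").find("div", class_="card")

# find() treats its first argument as a tag name, not a CSS selector,
# so there is no tag literally named 'a[href*="jobs"]' and the result is None.
as_tag_name = card.find('a[href*="jobs"]')

# select() understands CSS selectors but returns a list-like ResultSet,
# so .text has to be read from each element, not from the ResultSet itself.
all_links = card.select('a[href*="jobs"]')
link_texts = [a.text for a in all_links]

# select_one() returns only the first match (or None), so .text works directly.
first_link = card.select_one('a[href*="jobs"]')
first_text = first_link.text

print(as_tag_name)
print(link_texts)
print(first_text)
```

In short: find()/find_all() take tag names and attribute filters, select()/select_one() take CSS selectors, and the plural forms return collections.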

Example

Also store your information in a structured way for post processing:

import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.jobberman.com/jobs")
soup = BeautifulSoup(res.text, "html.parser")
data = []
for job in soup.select('div:has(>.search-result__body)'):
    data.append({
        'job':job.h3.text,
        'company':job.select_one('.search-result__job-meta').text.strip()
    })
data

Output
[{'job': 'Restaurant Manager', 'company': 'Balkaan Employments service'},
 {'job': 'Executive Assistant', 'company': 'Nolla Fresh & Frozen ltd'},
 {'job': 'Portfolio Manager/Instructor 1', 'company': 'Fun Science World'},
 {'job': 'Microbiologist', 'company': "NEIMETH INT'L PHARMACEUTICALS PLC"},
 {'job': 'Data Entry Officer', 'company': 'Nkoyo Pharmaceuticals Ltd.'},
 {'job': 'Chemical Analyst', 'company': "NEIMETH INT'L PHARMACEUTICALS PLC"},
 {'job': 'Senior Front-End Engineer', 'company': 'Salvo Agency'},...]
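
One caveat with the loop above: select_one() returns None when nothing matches, so chaining .text would raise an AttributeError on a listing that lacks the company element. A minimal sketch of a guard follows, assuming made-up markup that reuses the class names from this answer; text_or_none is a hypothetical helper, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the second listing deliberately has no company element.
html = """
<div>
  <div class="search-result__body">
    <h3>Restaurant Manager</h3>
    <div class="search-result__job-meta"> Acme Ltd </div>
  </div>
</div>
<div>
  <div class="search-result__body">
    <h3>Data Entry Officer</h3>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def text_or_none(tag):
    """Hypothetical helper: stripped text of a tag, or None if the tag is missing."""
    return tag.get_text(strip=True) if tag is not None else None


data = []
for job in soup.select("div:has(> .search-result__body)"):
    data.append({
        "job": text_or_none(job.h3),
        "company": text_or_none(job.select_one(".search-result__job-meta")),
    })

print(data)
```

Storing None for a missing field keeps every record the same shape, which makes later filtering or export simpler.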

Solution 2:[2]

The problems with your search strategy have been covered by the comments and answers posted earlier. I am offering a solution that uses the re (regex) library together with the find_all() function call:

    import requests
    from bs4 import BeautifulSoup
    import re
    
    res = requests.get("https://www.jobberman.com/jobs")
    soup = BeautifulSoup(res.text, "html.parser")
    # Raw string so that \? reaches the regex engine as an escaped question mark.
    company_name = soup.find_all("a", href=re.compile(r"/jobs\?"), rel="nofollow")
    for link in company_name:
        print(link.text)

Output:

GRATIAS DEI NIGERIA LIMITED

Balkaan Employments service

Fun Science World

NEIMETH INT'L PHARMACEUTICALS PLC

Nkoyo Pharmaceuticals Ltd.

...
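
The regex filter above can be tried offline on a small snippet. This is only a sketch: the markup and the company names below are invented for illustration, and the real page may structure its links differently:

```python
import re

from bs4 import BeautifulSoup

# Hypothetical markup: company links contain "/jobs?" and carry rel="nofollow",
# while the job-title link does not.
html = """
<a href="/jobs?company=acme" rel="nofollow">Acme Ltd</a>
<a href="/jobs/restaurant-manager">Restaurant Manager</a>
<a href="/jobs?company=beta" rel="nofollow">Beta Plc</a>
"""
soup = BeautifulSoup(html, "html.parser")

# A compiled pattern passed as an attribute filter is matched with search(),
# so it only needs to occur somewhere inside the href value.
links = soup.find_all("a", href=re.compile(r"/jobs\?"), rel="nofollow")
names = [link.text for link in links]

print(names)
```

Because find_all() returns a ResultSet, the text is read per element, which is the same reason .text failed on the select() result in the question.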

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Cheo Kee Jin