Scraping the first post from a phpBB3 forum with Python

I have a link like this:

http://www.arabcomics.net/phpbb3/viewtopic.php?f=98&t=71718

The first post of this phpBB3 topic contains several links.

How can I get the links in the first post?


I tried this, but it is not working:

import requests
from bs4 import BeautifulSoup as bs

url = 'http://www.arabcomics.net/phpbb3/viewtopic.php?f=98&t=71718'

response = requests.get(url)

soup = bs(response.text, 'html5lib')

itemstr = soup.findAll('div', {'class': 'postbody'})
for link in itemstr.findAll('a'):
    links = link.get('href')
    print(links)


Solution 1:[1]

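You can just use a regex for this; there is no need for BeautifulSoup. The pattern below keys on the class="postlink" attribute that phpBB puts on links inside posts, so it keeps working after a site redesign as long as that attribute stays.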

import re
myurlregex=re.compile(r'''(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'\".,<>?«»“”‘’]))\" class=\"postlink\"''')

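# The pattern has several capture groups, so re.findall() would return tuples rather than strings.
# Take group 1 (the URL itself) from each match; `response` is the requests.get() result from the question.
urls = [m.group(1) for m in myurlregex.finditer(response.text)]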
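
The same idea can be tried offline without hitting the forum. This is a minimal sketch, not the pattern above: `SAMPLE_HTML` is made-up markup in the style of a phpBB3 topic page, and `POSTLINK_RE` and `extract_postlinks` are hypothetical names for a simplified pattern with a single capture group, so `findall()` returns plain strings.

```python
import re

# Hypothetical sample of phpBB3-style markup; the real page may differ.
SAMPLE_HTML = '''
<div class="postbody">
  <div class="content">
    <a href="http://example.com/file1.zip" class="postlink">Part 1</a>
    <a href="http://example.com/file2.zip" class="postlink">Part 2</a>
  </div>
</div>
<div class="postbody">
  <div class="content">
    <a href="http://example.com/reply.zip" class="postlink">Reply link</a>
  </div>
</div>
'''

# Simplified stand-in for the long pattern: one capture group holding the href
# of any tag whose href is directly followed by class="postlink".
POSTLINK_RE = re.compile(r'href="([^"]+)"\s+class="postlink"', re.IGNORECASE)


def extract_postlinks(html):
    """Return every URL whose link carries class="postlink", in page order."""
    return POSTLINK_RE.findall(html)


print(extract_postlinks(SAMPLE_HTML))
```

Note that a page-wide regex like this picks up the links from every post, including replies. To keep only the first post you would have to cut the HTML down to the first post before matching.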

Regex is also a skill you will need again and again as a coder.
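
If you would rather stay with BeautifulSoup, the attempt in the question fails because `findAll()` returns a ResultSet (a list of tags), and a ResultSet has no `findAll()` of its own. A minimal sketch of one way around that, assuming the first post is the first `div` with class `postbody` on the page; `SAMPLE_HTML` and `first_post_links` are hypothetical names, and the markup is made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the topic page; the real page may differ.
SAMPLE_HTML = """
<div class="postbody">
  <a href="http://example.com/file1.zip" class="postlink">Part 1</a>
  <a href="http://example.com/file2.zip" class="postlink">Part 2</a>
</div>
<div class="postbody">
  <a href="http://example.com/reply.zip" class="postlink">Reply link</a>
</div>
"""


def first_post_links(html):
    """Return the href of every link inside the first div.postbody."""
    soup = BeautifulSoup(html, "html.parser")
    # find() returns a single Tag (or None), unlike find_all(), which returns a list.
    first_post = soup.find("div", class_="postbody")
    if first_post is None:
        return []
    # href=True skips anchors that have no href attribute.
    return [a["href"] for a in first_post.find_all("a", href=True)]


print(first_post_links(SAMPLE_HTML))
```

With the real page you would pass `response.text` from the question instead of `SAMPLE_HTML`. Whether the forum serves the post content without a login is not something this sketch can confirm.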

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Strings