BeautifulSoup works correctly until part of the page is loaded
Hello, I need help again. I should stress that I'm still learning, and much of this was copied from other scripts to get the result I need. The script works, but only up to a point. I'm trying to collect the ranking from an MMORPG site, which lists the top 1000 players on the server. This is the script I'm using:
import requests
from bs4 import BeautifulSoup

arquivo = open('arq01.csv','w')
arquivo.write("")
arquivo.close()
mundo = 192
grupo_mundo = 15
arquivo = open('arq01.csv','a', encoding="utf-8")

def filter(index, datas):
    nickname = None
    power = None
    guild = None
    names = []
    for data in datas:
        if ',' in data:
            power = data
        try:
            v = float(data)
        except Exception:
            if '\n' in data and len(data) > 3:
                data = data.rstrip('\n')
                data = data.lstrip('\n')
                data = data.lstrip('\n')
                names.append(data)
                nickname = data
            elif ',' not in data and len(data) > 3:
                names.append(data)
    for name in names:
        if name != nickname:
            guild = name
    #arquivo.write(f"{index} : {nickname} : {power} : {guild}\n")
    print(f"{index}: {nickname} - {power} - {guild}")
    data = {
        "index": index,
        "nickname":nickname,
        "power":power,
        "guild":guild,
    }

index = 0
for page in range(1, 11):
    url = f"https://forum.mir4global.com/rank?ranktype=1&worldgroupid={grupo_mundo}&worldid={mundo}&classtype=&searchname=&loaded=1&liststyle=ol&page={page}"
    r = requests.get(url)
    soup = BeautifulSoup(r.text, 'html.parser')
    datas = []
    for span in soup.find_all('span'):
        value = span.get_text()
        datas.append(value)
        if len(datas) == 8:
            index += 1
            filter(index, datas)
            datas = []
arquivo.close()
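The script opens arq01.csv but the write call is commented out, and the power values contain commas, which would clash with a comma-separated file. A minimal sketch of one way to write the rows, assuming each player is a dict with the same keys as the one built in the script (the helper name write_ranking is hypothetical, and the sample rows reuse the two results quoted in this question):

```python
import csv
import io


def write_ranking(rows, out):
    """Write player records as CSV; csv.writer quotes fields that contain commas."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["index", "nickname", "power", "guild"])
    for row in rows:
        # A missing guild (None) is written as an empty field.
        writer.writerow([row["index"], row["nickname"], row["power"], row["guild"] or ""])


rows = [
    {"index": 188, "nickname": "krelbynha", "power": "125,578", "guild": None},
    {"index": 189, "nickname": "XUXA BR", "power": "125,567", "guild": None},
]

# An in-memory buffer stands in for the real file in this sketch.
buffer = io.StringIO()
write_ranking(rows, buffer)
print(buffer.getvalue())
```

To write to disk instead, the same function could be called inside `with open('arq01.csv', 'w', newline='', encoding='utf-8') as arquivo:`, which also closes the file automatically.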
But something strange happens when a player in the site's ranking has no guild, which the site marks with a "-". When a player has no guild, the information for that player, or for the player directly above or below, comes out wrong, as you can see in the image and in the script output below:
Image showing the problem: https://i.stack.imgur.com/rs6ol.png
Scraping Result:
188, krelbynha, 125,578, None
189, XUXA BR, 125,567, None
In some results, the script also swaps the guilds (clans) of players for no apparent reason. I've tried several approaches without success.
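One possible cause, which cannot be confirmed without inspecting the page's HTML, is that slicing a flat list of every span into groups of 8 lets one row's values leak into the next when a row has a different number of spans. A minimal sketch of the alternative, parsing one ranking row at a time, assuming each row yields four text cells in the order rank, nickname, guild, power. That order, the name parse_row, and the guild "ExampleGuild" are hypothetical; the nicknames and power values are the ones quoted above.

```python
def parse_row(cells):
    """Turn the text cells of ONE ranking row into a record.

    Assumed (hypothetical) cell order: rank, nickname, guild, power.
    A guild shown as "-" on the site is stored as None.
    """
    rank, nickname, guild, power = (cell.strip() for cell in cells)
    return {
        "index": int(rank),
        "nickname": nickname,
        "power": power,
        "guild": None if guild == "-" else guild,
    }


# With BeautifulSoup, each inner list would come from a single row element,
# e.g. [cell.get_text() for cell in row.find_all(...)], not from the whole page.
rows = [
    ["188", "\nkrelbynha\n", "-", "125,578"],
    ["189", "\nXUXA BR\n", "ExampleGuild", "125,567"],
]

records = [parse_row(cells) for cells in rows]
for record in records:
    print(record)
```

Because each record is built from exactly one row, a missing guild affects only that player and cannot shift the values of the players above or below.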
Another thing I need and don't know how to do:
Besides solving the problem above, I would also like to know how to filter the results down to the players in a particular guild. What should I use to process this data so that the output lists only the players from guild "x"?
I apologize for any mistakes in my English. I understand there are other threads with similar problems, but as I said, I'm still learning, so please bear with me. :) Many thanks to anyone who can help.
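A minimal sketch of such a filter, assuming the players have first been collected into a list of dicts with the same keys as the dict built in the script (index, nickname, power, guild). The function name players_in_guild and all sample records are made up for illustration.

```python
def players_in_guild(players, guild_name):
    """Keep only the records whose guild equals guild_name.

    The comparison ignores case and surrounding whitespace; players
    without a guild (guild is None) never match.
    """
    wanted = guild_name.strip().casefold()
    return [
        p for p in players
        if p["guild"] is not None and p["guild"].strip().casefold() == wanted
    ]


# Made-up sample records in the same shape as the dict built in the script.
players = [
    {"index": 1, "nickname": "PlayerA", "power": "200,000", "guild": "ExampleGuild"},
    {"index": 2, "nickname": "PlayerB", "power": "190,000", "guild": None},
    {"index": 3, "nickname": "PlayerC", "power": "180,000", "guild": "exampleguild "},
    {"index": 4, "nickname": "PlayerD", "power": "170,000", "guild": "OtherGuild"},
]

for player in players_in_guild(players, "ExampleGuild"):
    print(player["index"], player["nickname"], player["power"])
```

For this to work with the script, the function that builds the dict would need to return it, and the main loop would need to append each returned dict to a list before filtering.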
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
