'How to convert div tags to a table?
I want to extract the table from this website https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=37&year=214 Checking the source of that website, I noticed that somehow the table tag is missing. I assume that this table is a summary of multiple div classes. Is there any easy approach to convert this table to excel/csv? I badly have coding skills/experience... Appreciate any help
Solution 1:[1]
There are a few way to do that. One of which (in python) is (pretty self-explanatory, I believe):
import lxml.html as lh
import csv
import requests
url = 'https://www.rankingthebrands.com/The-Brand-Rankings.aspx?rankingID=37&year=214'
req = requests.get(url)
doc = lh.fromstring(req.text)
headers = ['Position', 'Name', 'Brand Value', 'Last']
with open('brands.csv', 'a', newline='') as fp:
#note the 'a' in there - for 'append`
file = csv.writer(fp)
file.writerow(headers)
#with the headers out of the way, the heavier xpath lifting begins:
for row in doc.xpath('//div[@class="top100row"]'):
pos = row.xpath('./div[@class="pos"]//text()')[0]
name = row.xpath('.//div[@class="name"]//text()')[0]
brand_val = row.xpath('.//div[@class="weighted"]//text()')[0]
last = row.xpath('.//div[@class="lastyear"]//text()')[0]
file.writerow([pos,name,brand_val,last])
The resulting file should be at least close to what you're looking for.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Jack Fleeting |
