Creating dictionary from two lists of webscraped data

I have some web data that I scraped from a product website using BeautifulSoup, across multiple product pages. The scraper gives me two lists: one with the specification names and one with the corresponding specification values. Here is an example:

Blade length : 2.97/1.97" 
Blade Thickness : 0.090/2.54" 
Open Length :   7.05/6.05" 
Closed Length:  4.08" | 9.78cm
Handle Thickness:   0.40" | 10.16mm
Weight: 2.28oz | 64.64g

I want the left-hand side to be the keys of a dictionary and the right-hand side to be the values. The ultimate goal is to write a CSV in which the left-hand side becomes the column headers for the data on the right-hand side. Since I am scraping multiple pages, the left-hand side repeats itself and there are multiple values for the right-hand side.

So the desired output should be something like this:

| Blade Length | Blade Thickness | Open Length | etc. |
|--------------|-----------------|-------------|------|
| 2.97/1.97"   | 4.34/12.54      | 1.23/5.65   | ...  |
| 4.24/2.23"   | 2.34/5.63       | 5.43/2.90   | ...  |
| 3.54/2.65    | 2.57/6.54       | 6.90/4.20   | ...  |
| 7.65/5/43    | 4.65/3.56       | 3.32/4.54   | ...  |

If there is a better way to do this than dictionaries, please let me know!

The HTML is something like this:

<table class="specifications-table">
     <tbody>
               <tr>
                    <th class="col label">Blade Length:</th>
                    <td class="col value">2.97/1.97"</td>
               </tr>
               <tr>
                    <th class="col label">Blade Thickness:</th>
                    <td class="col value">0.090"</td>
               </tr>
               <tr>
                    <th class="col label">Open Length: </th>
                    <td class="col value">7.05/6.05"</td>
               </tr>
               <tr>
                    <th class="col label">Closed Length: </th>
                    <td class="col value">4.08"</td>
               </tr>
               <tr>
                    <th class="col label">Handle Thickness:</th>
                    <td class="col value">0.40" </td>
               </tr>
               <tr>
                    <th class="col label">Weight:</th>
                    <td class="col value">2.28oz</td>
               </tr>
      </tbody>
</table>

Here is my attempt to get this data:

Specs = []
Specs_Datas = defaultdict(list)
Specs2 = []
for links in product_links:
    HTML2 = requests.get(links, HEADER)
    Booti2 = soup(HTML2.content,"html.parser") 
    table_feature = Booti2.select_one('#product-attribute-specs-table')
    #find all rows
    try:
        for S in Booti2.find_all('th', attrs ={'class': 'col label'}):
            Specs.append(S.text.replace('\n', '').strip())
            unique_specs = np.unique(Specs).tolist()
            while unique_specs in Specs:
                for SD in Booti2.find_all('td', attrs ={'class': 'col value'}):
                    Specs2.append(SD.text.replace('\n', '').strip())
                    Specs_Datas[unique_specs] = []
                    Specs_Datas[unique_specs].update(SD.text.replace('\n', '').strip())
            #Specs.append(S.text.replace('\n', '').strip())
            
    except:
        continue

Any help would be appreciated!! Thank you so much!!!



Solution 1:[1]

Assuming you actually have two lists such as:

specs = [
    "Blade length",
    "Blade Thickness",
    "Open Length",
    "Closed Length",
    "Handle Thickness",
    "Weight"
]

and

spec_data = [
    "2.97/1.97\"",
    "0.090/2.54\"",
    "7.05/6.05\"",
    "4.08\" | 9.78cm",
    "0.40\" | 10.16mm",
    "2.28oz | 64.64g"
]

The easiest path might be to zip() them together:

specs_reshaped = [
    {key: value for key, value in zip(specs, spec_data)}
]
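One caveat: zip() stops at the shorter list without complaint, so a page with a missing value would pair labels with the wrong data. A minimal sketch of a guard, assuming you would rather skip such a page than write a misaligned row (pair_specs and the demo lists are hypothetical names, not part of the question's code):

```python
# Demo data: three labels but only two values, as if one <td> were missing.
demo_labels = ["Blade length", "Blade Thickness", "Open Length"]
demo_values = ['2.97/1.97"', '0.090/2.54"']


def pair_specs(labels, values):
    """Return a dict of label -> value, refusing lists of unequal length."""
    if len(labels) != len(values):
        raise ValueError(f"{len(labels)} labels but {len(values)} values")
    return dict(zip(labels, values))


try:
    pair_specs(demo_labels, demo_values)
except ValueError as error:
    message = str(error)
    print("skipping page:", message)
```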

Then write the result with csv.DictWriter:

import csv

# newline="" lets the csv module control line endings itself
with open("output.csv", "w", newline="", encoding="utf-8") as file_out:
    writer = csv.DictWriter(file_out, fieldnames=specs)
    writer.writeheader()
    writer.writerows(specs_reshaped)

This produces a file like:

Blade length,Blade Thickness,Open Length,Closed Length,Handle Thickness,Weight
"2.97/1.97""","0.090/2.54""","7.05/6.05""","4.08"" | 9.78cm","0.40"" | 10.16mm",2.28oz | 64.64g

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 JonSG