Python CSV to dictionary with multiple row entries per 1 item

I am using Python to turn a CSV file into a dictionary, where the CSV file can have multiple values in the same column for a single item.

The following works to use the CSV headers (first line) as the named key to turn a simple CSV without multiple values into a dictionary:

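import pandas as pd


def main():
    content = csvArray(".../Csv.csv")
    print(content)


def csvArray(path):
    df = pd.read_csv(path)
    # One dict per row, keyed by the column names from the header line.
    records = df.to_dict(orient='records')
    return records


if __name__ == "__main__":
    main()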

However, I now have an issue. There is an Image column in the CSV, and in many cases, there are multiple entries per column for 1 item, formatted like:

SKU ImageData
12345 1st Image Data
2nd Image Data
3rd Image Data
12346 1st Image Data
2nd Image Data

etc...

There can be anywhere up to 8 images for 1 SKU.

My csvArray function does not work with the CSV formatted as such, and changing the format of the CSV is not possible from the export.

How could I concatenate all the image data into the first row? Or any alternative that could work turning the CSV into a dictionary?



Solution 1:[1]

Data from your comment to your question:

s = '''Internal Reference;Name;Extra Product Media/Image TGTLI20018;20V Grass Trimmer - Body only;1st Image base64 data ;;2nd Image base64 data ;;3rd Image base64 data ;;4th Image base64 data ;;5th Image base64 data TGTLI20019;25V Grass Trimmer;1st Image base64 data ;;2nd Image base64 data'''

If you can determine a pattern that delineates records and will not occur in the base64 image data, such as

pattern = ' TGTLI'

then you can:

  • find all the indices of this pattern in the data: (49, 208) in this case

  • iterate over the indices in overlapping (current, next) pairs and use them to slice the data, e.g.

    record = s[49:208]
    
  • split the record on the semicolon

>>> s[49:208].split(';')
[' TGTLI20018', '20V Grass Trimmer - Body only', '1st Image base64 data ', '', '2nd Image base64 data ', '', '3rd Image base64 data ', '', '4th Image base64 data ', '', '5th Image base64 data']
  • extract the fields and make the dictionary.
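
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the names `starts`, `bounds`, `header` and `records` are made up for the example, the first three fields of every record are taken to be reference, name and first image, and the last record is assumed to run to the end of the string.

```python
import re

s = ('Internal Reference;Name;Extra Product Media/Image'
     ' TGTLI20018;20V Grass Trimmer - Body only;1st Image base64 data'
     ' ;;2nd Image base64 data ;;3rd Image base64 data'
     ' ;;4th Image base64 data ;;5th Image base64 data'
     ' TGTLI20019;25V Grass Trimmer;1st Image base64 data ;;2nd Image base64 data')

pattern = ' TGTLI'

# Start index of every record (every occurrence of the pattern).
starts = [m.start() for m in re.finditer(re.escape(pattern), s)]

# Pair each start with the next one; the last record ends at len(s).
bounds = zip(starts, starts[1:] + [len(s)])

# Everything before the first record is the header line.
header = s[:starts[0]].split(';')

records = []
for begin, end in bounds:
    fields = [field.strip() for field in s[begin:end].split(';')]
    reference, name, *images = fields
    records.append({
        header[0]: reference,
        header[1]: name,
        # Drop the empty strings left behind by the ';;' continuation markers.
        header[2]: [image for image in images if image],
    })

print(records)
```

Each item ends up as one dictionary whose image entry is a list, so an item with 1 image and an item with 8 images have the same shape.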

Related questions:

How to find all occurrences of a substring?
Iterate a list as pair (current, next) in Python

You can find many more examples and Q&As like those by searching here on SO.
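
If the export actually has one line per image, with the key columns left blank on the continuation lines (the `;;` runs in the data suggest this, but it is an assumption about the file), another option is to read it row by row with the standard `csv` module and append each continuation row to the previous record. A minimal sketch, where `csv_to_records` and `sample` are hypothetical names and every row is assumed to have exactly three fields:

```python
import csv
import io

# Assumed layout: one line per image, key columns blank on continuation lines.
sample = """Internal Reference;Name;Extra Product Media/Image
TGTLI20018;20V Grass Trimmer - Body only;1st Image base64 data
;;2nd Image base64 data
;;3rd Image base64 data
TGTLI20019;25V Grass Trimmer;1st Image base64 data
;;2nd Image base64 data
"""


def csv_to_records(handle, delimiter=';'):
    """Merge rows with a blank first column into the preceding record."""
    reader = csv.reader(handle, delimiter=delimiter)
    key_col, name_col, image_col = next(reader)  # header row
    records = []
    for row in reader:
        if not row:
            continue  # skip completely empty lines
        key, name, image = (cell.strip() for cell in row)
        if key:
            # A non-blank key starts a new item.
            records.append({key_col: key, name_col: name, image_col: []})
        if image and records:
            records[-1][image_col].append(image)
    return records


records = csv_to_records(io.StringIO(sample))
print(records)
```

With a real file you would pass `open(path, newline='')` instead of the `io.StringIO` object. Real base64 image fields can be longer than the `csv` module's default field size limit, in which case it may need to be raised with `csv.field_size_limit`.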

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1