Assign many values to one key value - Python For Loop

I am practicing with a dataset with customers. Each customer has a first name, last name, city, age, gender and invoice number.

I want to create a dictionary with the customer's first and last name as the key and store the rest of the information as the value. A customer can have many invoices, so each customer should be counted only once and hold many invoice numbers.

City         FirstName  LastName  Gender  Age  InvoiceNum
NYC          Jane       Doe       Female  35   1023
NYC          Jane       Doe       Female  35   6523
Jersey City  John       Smith     Male    54   6985
Houston      Kay        Johnson   Female  45   2357

To do so, I want to create a for loop.

class Customers:
   city = ""
   age = 0
   invoices = []

import csv

f = open("customers.csv")
reader = csv.reader(f)
next(reader)

customers = {}
for row in reader:

This is where I am stuck. For every row in reader, I want to check if the customer already exists. If it exists, I want to add the repeating invoice numbers. If it does not exist, this will be a new customer where I will have to append the other values (city, gender, age, single invoice number).

Desired Output:

There are 3 customers. 2 are female, 1 is male. Their average age is xxxx.

The customer count should not count Jane Doe twice, the female count should not count her twice, and the average age should not include her age twice.

Solution 1:^[1]

I came up with this:

from collections import defaultdict
from dataclasses import dataclass, field
from typing import List

@dataclass
class Customer:
    first_name: str = ''
    last_name: str = ''
    city: str = ''
    age: int = 0
    invoices: List = field(init=False, default_factory=list)
    
    def process_entry(self, **row):
        self.first_name = row['FirstName']
        self.last_name = row['LastName']
        self.city = row['City']
        self.age = row['Age']
        self.invoices.append(row['InvoiceNum'])

fake_reader = [
    {
        'FirstName': 'John',
        'LastName': 'Doe',
        'City': 'New York',
        'Age': 30,
        'InvoiceNum': 1
    },
    {
        'FirstName': 'John',
        'LastName': 'Doe',
        'City': 'New York',
        'Age': 30,
        'InvoiceNum': 2
    },
    {
        'FirstName': 'Clark',
        'LastName': 'Kent',
        'City': 'Metropolis',
        'Age': 35,
        'InvoiceNum': 3
    }
]

customers = defaultdict(Customer)
for row in fake_reader:
    customers[(row['FirstName'], row['LastName'])].process_entry(**row)

print(customers)

Output:

defaultdict(<class '__main__.Customer'>, {('John', 'Doe'): Customer(first_name='John', last_name='Doe', city='New York', age=30, invoices=[1, 2]), ('Clark', 'Kent'): Customer(first_name='Clark', last_name='Kent', city='Metropolis', age=35, invoices=[3])})

The "trick" here is to define the Customer class with default values. That lets defaultdict create an empty Customer the first time a key is seen, and the process_entry method then fills in the real values.
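To connect this back to the question, here is a minimal sketch of the same idea reading real CSV text with csv.DictReader and then producing the counts the question asks for. It assumes the file is comma-separated with the header shown in the question; CustomerRecord, SAMPLE_CSV and the gender field are names I added for illustration and are not part of the answer above.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass, field
from typing import List

# Sample rows from the question, written as comma-separated text.
# Assumption: the real customers.csv uses the same header and delimiter.
SAMPLE_CSV = """City,FirstName,LastName,Gender,Age,InvoiceNum
NYC,Jane,Doe,Female,35,1023
NYC,Jane,Doe,Female,35,6523
Jersey City,John,Smith,Male,54,6985
Houston,Kay,Johnson,Female,45,2357
"""


@dataclass
class CustomerRecord:  # hypothetical name, chosen to avoid clashing with Customer
    city: str = ''
    gender: str = ''
    age: int = 0
    invoices: List[int] = field(default_factory=list)

    def process_entry(self, row):
        self.city = row['City']
        self.gender = row['Gender']
        self.age = int(row['Age'])  # csv yields strings, so convert
        self.invoices.append(int(row['InvoiceNum']))


records = defaultdict(CustomerRecord)
# With a real file, replace io.StringIO(SAMPLE_CSV) with an open file object.
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    records[(row['FirstName'], row['LastName'])].process_entry(row)

# Each customer appears once in records, so nothing is counted twice.
females = sum(1 for c in records.values() if c.gender == 'Female')
males = sum(1 for c in records.values() if c.gender == 'Male')
average_age = sum(c.age for c in records.values()) / len(records)

print(f"There are {len(records)} customers: {females} female, {males} male.")
print(f"Their average age is {average_age:.1f}.")
```

Because the dictionary is keyed by (first name, last name), Jane Doe's two rows end up in one record with two invoice numbers.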

Solution 2:^[2]

I think you're looking for something of the sort:

if name not in customers:
    customers[name] = [invoice]
else: 
    customers[name].append(invoice)

This creates a key-value pair whose value is a list. The first invoice for a name creates the list, and every later invoice the for loop finds for that name is appended to it.
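The same check-then-append pattern can probably be written more compactly with dict.setdefault. This is only a sketch; the pairs list is made-up data standing in for rows read from the file.

```python
# Hypothetical (name, invoice) pairs standing in for rows read from the file.
pairs = [("Jane Doe", 1023), ("Jane Doe", 6523), ("John Smith", 6985)]

invoices_by_name = {}
for name, invoice in pairs:
    # setdefault returns the existing list for name, or stores and
    # returns a new empty list if the name has not been seen yet.
    invoices_by_name.setdefault(name, []).append(invoice)

print(invoices_by_name)
```

Whether this reads better than the explicit if/else is a matter of taste; the behavior is the same.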

Edit: update to match your csv file

customers = {}
# reader is the csv.reader from the question; next(reader) already skipped the header
for row in reader:
    # csv.reader yields each row as a list of strings, one per column
    City, FirstName, LastName, Gender, Age, InvoiceNum = [value.strip() for value in row]
    newEntry = {'InvoiceNum': int(InvoiceNum), 'City': City, 'Gender': Gender, 'Age': int(Age)}

    if (FirstName, LastName) not in customers:
        customers[(FirstName, LastName)] = [newEntry]
    else:
        customers[(FirstName, LastName)].append(newEntry)

Hashable types (such as strings and tuples of strings) can be dictionary keys, so I chose a tuple of the first and last name.
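A small sketch of why a tuple works as the key here while a list would not. The invoice numbers come from the question's sample data; list_key_error is just a name I picked to hold the error message.

```python
customers = {}
customers[('Jane', 'Doe')] = [1023]      # a tuple of strings is hashable
customers[('Jane', 'Doe')].append(6523)  # an equal tuple finds the same entry

list_key_error = None
try:
    customers[['Jane', 'Doe']] = [1023]  # a list is mutable, so it cannot be hashed
except TypeError as error:
    list_key_error = str(error)

print(customers)
print(list_key_error)
```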

Edit: I'm hoping my answer takes you in the right direction. I left the csv details to you, as your rows may not correspond to what I did there.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	DevLounge
Solution 2	marc_s
