How to count occurrences and calculate a rating with the csv module?

You have a CSV file of individual song ratings and you'd like to know the average rating for a particular song. The file will contain a single 1-5 rating for a song per line.

Write a function named average_rating that takes two strings as parameters where the first string represents the name of a CSV file containing song ratings in the format: "YouTubeID, artist, title, rating" and the second parameter is the YouTubeID of a song. The YouTubeID, artist, and title are all strings while the rating is an integer in the range 1-5. This function should return the average rating for the song with the inputted YouTubeID.

Note that each line of the CSV file is an individual rating from a user and that each song may be rated multiple times. As you read through the file you'll need to track a sum of all the ratings as well as how many times the song has been rated to compute the average rating. (My code is below.)

import csv
def average_rating(csvfile, ID):
    with open(csvfile) as f:
        file = csv.reader(f)
        total = 0
        total1 = 0
        total2 = 0
        for rows in file:
            for items in ID:
                if rows[0] == items[0]:
                    total = total + int(rows[3])
                    for ratings in total:
                        total1 = total1 + int(ratings)
                        total2 = total2 + 1
    return total1 / total2

I am getting an error on input ['ratings.csv', 'RH5Ta6iHhCQ']: division by zero. How would I go about resolving the problem?



Solution 1:[1]

You can do this with a pandas DataFrame. This assumes the CSV file has a header row that names the YouTubeID and rating columns.

import pandas as pd
df = pd.read_csv('filename.csv')
total_sum = df[df['YouTubeID'] == 'RH5Ta6iHhCQ'].rating.sum()
n_rating = len(df[df['YouTubeID'] == 'RH5Ta6iHhCQ'].rating)
average = total_sum/n_rating

Solution 2:[2]

There are a few confusing things in the code, so I think renaming the variables and refactoring would be a smart decision. It might even make things more obvious if one function were tasked with getting all the rows for a specific YouTube ID and another with calculating the average.

import csv

def average_rating(csvfile, id):
    '''
    Calculate the average rating of a youtube video

    params: - csvfile: the location of the source rating file
            - id: the id of the video we want the average rating of
    '''
    total_ratings = 0
    count = 0
    with open(csvfile) as f:
        file = csv.reader(f)
        for rating in file:
            if rating[0] == id:
                count += 1
                # csv.reader yields strings, so convert before adding
                total_ratings += int(rating[3])
    # No matching rows: return 0 instead of dividing by zero
    if count == 0:
        return 0
    return total_ratings / count
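
The two-function split suggested above could look roughly like the sketch below. The helper names ratings_for and average are hypothetical, and the sketch assumes the file has no header row and that the rating is the fourth column, as in the format given in the question.

```python
import csv

def ratings_for(csvfile, video_id):
    """Return the integer ratings of every row whose first column equals video_id."""
    with open(csvfile, newline='') as f:
        # Skip empty rows; csv.reader yields strings, so convert the rating to int.
        return [int(row[3]) for row in csv.reader(f) if row and row[0] == video_id]

def average(values):
    """Return the mean of values, or 0 for an empty list (same fallback as above)."""
    return sum(values) / len(values) if values else 0

def average_rating(csvfile, video_id):
    """Combine the two helpers: collect the ratings, then average them."""
    return average(ratings_for(csvfile, video_id))
```

Keeping the file reading separate from the arithmetic means each piece can be checked on its own, for example by calling average([4, 5]) without any file at all.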

Solution 3:[3]

import csv

def average_rating(csvfile, ID):
    with open(csvfile) as f:
        file = csv.reader(f)
        cont = 0
        total = 0
        for rows in file:
            if rows[0] == ID:
                cont = cont + 1
                total = total + int(rows[3])
    return total / cont

This works.
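
As for why the code in the question divides by zero: ID is a string, so "for items in ID" walks over its characters one at a time, and rows[0] == items[0] compares the full ID from the file against a single character. That comparison is never true for an 11-character ID, so total2 stays at 0. (Had a row matched, "for ratings in total" would have raised a TypeError instead, because an int is not iterable.) A small illustration, using a made-up sample row:

```python
ID = 'RH5Ta6iHhCQ'
row = ['RH5Ta6iHhCQ', 'Some Artist', 'Some Title', '4']  # made-up sample row

# Iterating over a string yields its characters, not the string itself.
chars = [items for items in ID]

# The test from the question: the full ID from the file versus one character.
matches_in_loop = [row[0] == items[0] for items in ID]

# The test Solution 3 uses: compare against the whole ID once.
matches_directly = row[0] == ID

print(chars[:3])             # ['R', 'H', '5']
print(any(matches_in_loop))  # False
print(matches_directly)      # True
```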

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 shaik moeed
Solution 2
Solution 3