Calculating average artist entropy given user predictions and tracks in recommender systems

I have to calculate the average artist entropy of users. I have solved this task for one test case, but I am not able to generalize it to more test cases. The Shannon entropy formula was used to calculate the entropy of each user.

import numpy as np
import pandas as pd


def get_average_entropy_score(predictions: np.ndarray, item_df: pd.DataFrame, topK=10) -> float:
    """
    predictions - np.ndarray - predictions of the recommendation algorithm for each user.
    item_df - pd.DataFrame - information about each song with columns 'artist' and 'track'.

    returns - float - average entropy score of the predictions.
    """

    score = None

    # TODO: YOUR IMPLEMENTATION.

    # Artist of each item, indexed by item position in item_df.
    l = []
    for i in item_df['artist']:
        l.append(i)

    # One hard-coded accumulator per artist (A1..A4); this is the part that does not generalize.
    prob = 0
    prob2 = 0
    prob3 = 0
    prob4 = 0

    for i in range(len(predictions)):
        for j, v in enumerate(predictions[i]):
            if l[v] == 'A1':
                p = 1/len(predictions[i])
                prob += p
            if l[v] == 'A2':
                p = 1/len(predictions[i])
                prob2 += p
            if l[v] == 'A3':
                p = 1/len(predictions[i])
                prob3 += p
            if l[v] == 'A4':
                p = 1/len(predictions[i])
                prob4 += p
            if v != -1:
                continue

    entro1 = (prob*np.log2(prob))
    entro2 = -(prob2*np.log2(prob2) + prob3*np.log2(prob3) + prob4*np.log2(prob4))

    add = entro1 + entro2
    entropy_over_users = add/4   # number of items/user
    score = entropy_over_users
    print(entropy_over_users)

    return score

Now imagine I have a DataFrame of artists and tracks like the following:

item_df = pd.DataFrame({'artist': ['A1', 'A1', 'A1', 'A1', 'A2', 'A3', 'A4']})

And I have the predictions of a recommender system, with each user's recommended items in positions 0, 1, 2, and 3, like the following:

predictions = np.array([[0, 1, 2, 3], [6, 5, 4, 3], [-1, -1, -1, -1]])

From the predictions, for example, user 1 has been recommended item 0 first, item 1 second, item 2 third, and item 3 fourth. A prediction of -1 means I should ignore this value, because this item has not been seen by the user and should not be included in the calculation at all.
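To make the index convention concrete, here is a small sketch of the lookup step. It assumes that each prediction is a positional index into item_df and that -1 is the only placeholder value; the names artist_lookup, valid, and recommended_artists are illustrative and not part of the original code.

```python
import numpy as np
import pandas as pd

item_df = pd.DataFrame({'artist': ['A1', 'A1', 'A1', 'A1', 'A2', 'A3', 'A4']})
predictions = np.array([[0, 1, 2, 3], [6, 5, 4, 3], [-1, -1, -1, -1]])

# Artist of each item, addressable by item position.
artist_lookup = item_df['artist'].to_numpy()

recommended_artists = []
for user_preds in predictions:
    # Drop the -1 placeholders before indexing; otherwise -1 would
    # silently select the last row of item_df.
    valid = user_preds[user_preds != -1]
    recommended_artists.append(artist_lookup[valid].tolist())

print(recommended_artists)
# [['A1', 'A1', 'A1', 'A1'], ['A4', 'A3', 'A2', 'A1'], []]
```

Filtering before the lookup matters because Python and NumPy treat -1 as "last element", so an unfiltered -1 would be counted as artist A4 here.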

Now the question is: I can't get it to work for the general case where, for example, I don't know the artist names (A1, A2, and so on) in advance. Or better, imagine you don't know the track names. Also note that item 0 in the predictions refers to the first track in item_df, item 1 to the second, and so on. Please help me. I don't know how to progress further! Please ask if something is unclear! Thanks!

Additional remark: solving the test case on paper gave me 0.5 if I normalize it.
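One possible way to generalize this, sketched under a few assumptions: the artist counts are taken per user with np.unique instead of hard-coded names, users with no valid recommendation are left out of the average, and the optional normalize flag (not part of the original signature) divides each user's entropy by log2 of the number of valid items. Those choices are one reading of the task, not a confirmed specification.

```python
import numpy as np
import pandas as pd


def get_average_entropy_score(predictions: np.ndarray, item_df: pd.DataFrame,
                              topK: int = 10, normalize: bool = False) -> float:
    """Average base-2 Shannon entropy of the artists recommended to each user."""
    artist_lookup = item_df['artist'].to_numpy()
    entropies = []
    for user_preds in predictions:
        top = np.asarray(user_preds)[:topK]
        valid = top[top != -1]            # ignore the -1 placeholders
        if valid.size == 0:
            continue                      # assumption: skip users with no valid items
        # Count how often each artist appears, whatever the artist names are.
        _, counts = np.unique(artist_lookup[valid], return_counts=True)
        p = counts / counts.sum()
        h = float(np.sum(p * np.log2(1.0 / p)))   # same as -sum(p * log2(p))
        if normalize and valid.size > 1:
            h /= np.log2(valid.size)      # assumption: scale by the maximum possible entropy
        entropies.append(h)
    # Assumption: return 0.0 when no user has any valid recommendation.
    return float(np.mean(entropies)) if entropies else 0.0


item_df = pd.DataFrame({'artist': ['A1', 'A1', 'A1', 'A1', 'A2', 'A3', 'A4']})
predictions = np.array([[0, 1, 2, 3], [6, 5, 4, 3], [-1, -1, -1, -1]])

print(get_average_entropy_score(predictions, item_df))                  # 1.0
print(get_average_entropy_score(predictions, item_df, normalize=True))  # 0.5
```

On the test case, the first user gets entropy 0 (only A1), the second gets log2(4) = 2 (four different artists), and the third is skipped, so the average is 1.0, or 0.5 with the normalization described above.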

python recommender-systems

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
