Slicing within custom likelihood function in pymc3
I want to do something along the lines of this blog on matrix factorisation and this blog on Rasch models. However, I have too many observations to fit them all at once. Hence I need my custom likelihood function to work with partial observations.
The code below works, but it relies on the global users array inside the logp function, which feels ugly. What is the recommended pattern for updating your distributions iteratively over batches of observations?
import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt

# Create data
n_users = 100
n_items = 10
n_outcomes = 300
users = np.random.randint(0, n_users, size=n_outcomes)
items = np.random.randint(0, n_items, size=n_outcomes)
outcomes = np.random.choice([0, 1], p=[0.2, 0.8], size=n_outcomes)
users[:10], items[:10], outcomes[:10]

with pm.Model() as model:
    # Independent priors
    users_dist = pm.Normal('users', mu=0, sigma=3, shape=n_users)
    items_dist = pm.Normal('items', mu=0, sigma=3, shape=n_items)

    # Log-likelihood
    def logp(obs):
        """The log-likelihood of the observations."""
        users_observed = users_dist[users]  # relies on the global `users`
        items_observed = items_dist[items]  # relies on the global `items`
        predicted_prob = tt.nnet.sigmoid(users_observed - items_observed)
        positive = obs * tt.log(predicted_prob)
        negative = (1 - obs) * tt.log(1 - predicted_prob)
        return positive + negative

    ll = pm.DensityDist('ll', logp, observed=outcomes)
    trace = pm.sample(1000)

az.plot_trace(trace);
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow