Slicing within custom likelihood function in pymc3
I want to do something along the lines of this blog on matrix factorisation and this blog on Rasch models. However, I have too many observations to fit them all at once. Hence I need my custom likelihood function to work with partial observations.
The code below works, but it relies on the global users array inside the logp function, which feels ugly. What is the recommended pattern for updating your distributions iteratively over batches of observations?
import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt

# Create data
n_users = 100
n_items = 10
n_outcomes = 300
users = np.random.randint(0, n_users, size=n_outcomes)
items = np.random.randint(0, n_items, size=n_outcomes)
outcomes = np.random.choice([0, 1], p=[0.2, 0.8], size=n_outcomes)
users[:10], items[:10], outcomes[:10]

with pm.Model() as model:
    # Independent priors
    users_dist = pm.Normal('users', mu=0, sigma=3, shape=n_users)
    items_dist = pm.Normal('items', mu=0, sigma=3, shape=n_items)

    # Log-likelihood
    def logp(obs):
        """The log-likelihood of the observations."""
        users_observed = users_dist[users]  # relies on the global `users`
        items_observed = items_dist[items]  # relies on the global `items`
        predicted_prob = tt.nnet.sigmoid(users_observed - items_observed)
        positive = obs * tt.log(predicted_prob)
        negative = (1 - obs) * tt.log(1 - predicted_prob)
        return positive + negative

    ll = pm.DensityDist('ll', logp, observed=outcomes)
    trace = pm.sample(1000)

az.plot_trace(trace);
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow