Debugging an implementation of the Viterbi algorithm for an HMM
I'm having trouble finding what's wrong with my implementation of the Viterbi algorithm, which I'm using for NER (Named Entity Recognition). The `emission_log_probs` (S x V), `transition_log_probs` (S x S), and `init_log_probs` (S) matrices hold the emission, transition, and initial-state log probabilities respectively, where S is the number of states and V is the number of words in the vocabulary. Words (tokens) are indexed by a `word_indexer`, which simply maps string -> int:
    N = len(sentence_tokens)   # number of tokens
    S = len(self.tag_indexer)  # number of states
    scores = np.empty((N, S), 'd')     # log probability of the most likely path so far
    backtrace = np.empty((N, S), 'B')  # state that achieves the max at each step
    # initialize
    first_word = self.word_indexer.index_of(sentence_tokens[0].word)
    scores[0, :] = self.init_log_probs + self.emission_log_probs[:, first_word]
    backtrace[0, :] = 0
    # recurrence
    for i in range(1, N):
        word = self.word_indexer.index_of(sentence_tokens[i].word)
        for s in range(S):
            # scores of every path ending in state s at position i
            candidates = scores[i - 1, :] + self.transition_log_probs[:, s]
            backtrace[i, s] = np.argmax(candidates)
            scores[i, s] = self.emission_log_probs[s, word] + np.max(candidates)
    # build sequence by following backpointers from the best final state
    x = np.empty(N, 'B')
    x[-1] = np.argmax(scores[N - 1, :])
    for i in reversed(range(1, N)):
        x[i - 1] = backtrace[i, x[i]]
    # x is the final sequence of states
It seems I should be achieving much higher accuracy on this dataset (my instructor suggested as much), but I'm stuck on where I'm going wrong. Does anyone see anything immediately concerning?
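One way to sanity-check the recurrence itself is to run a minimal, self-contained version of the same Viterbi loop against brute-force enumeration of all state paths on toy data. This is a sketch, not the asker's actual code: the matrix names match the question, but the toy sizes, random probabilities, and the `brute_force` helper are illustrative assumptions.

```python
import numpy as np
from itertools import product

S, V = 2, 3  # toy number of states and vocabulary size
rng = np.random.default_rng(0)

def random_log_probs(shape, axis):
    # Random valid distributions along `axis`, returned in log space.
    p = rng.random(shape)
    p /= p.sum(axis=axis, keepdims=True)
    return np.log(p)

emission_log_probs = random_log_probs((S, V), axis=1)    # S x V
transition_log_probs = random_log_probs((S, S), axis=1)  # [prev, next]
init_log_probs = random_log_probs((S,), axis=0)          # S

def viterbi(sentence):
    # Same structure as the code in the question, on plain word indices.
    N = len(sentence)
    scores = np.empty((N, S))
    backtrace = np.zeros((N, S), dtype=int)
    scores[0] = init_log_probs + emission_log_probs[:, sentence[0]]
    for i in range(1, N):
        for s in range(S):
            candidates = scores[i - 1] + transition_log_probs[:, s]
            backtrace[i, s] = np.argmax(candidates)
            scores[i, s] = np.max(candidates) + emission_log_probs[s, sentence[i]]
    path = np.empty(N, dtype=int)
    path[-1] = np.argmax(scores[-1])
    for i in reversed(range(1, N)):
        path[i - 1] = backtrace[i, path[i]]
    return list(path), scores[-1].max()

def brute_force(sentence):
    # Score every possible state sequence and keep the best one.
    best_path, best_score = None, -np.inf
    for states in product(range(S), repeat=len(sentence)):
        score = init_log_probs[states[0]] + emission_log_probs[states[0], sentence[0]]
        for i in range(1, len(sentence)):
            score += (transition_log_probs[states[i - 1], states[i]]
                      + emission_log_probs[states[i], sentence[i]])
        if score > best_score:
            best_path, best_score = list(states), score
    return best_path, best_score

sentence = [0, 2, 1, 2]  # toy word indices
v_path, v_score = viterbi(sentence)
b_path, b_score = brute_force(sentence)
assert v_path == b_path and np.isclose(v_score, b_score)
```

If the dynamic-programming version agrees with brute force on small random inputs, the recurrence is likely fine, and the accuracy gap is probably elsewhere (e.g. how unknown words are handled by `word_indexer.index_of`, or how the matrices were estimated).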
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow