Difference between LSTMCell and MinimalRNNCell

When I looked up the implementations of LSTMCell and MinimalRNNCell at the following link, I found that they are quite different. Does anyone know the reason? If I want to create my own LSTM cell, should I follow the pattern of LSTMCell or of MinimalRNNCell? And if the answer is LSTMCell, then what is the purpose of MinimalRNNCell?

https://github.com/keras-team/keras/blob/20b840fa4d8e62674a9090e34fc9943a4ecd04ec/keras/layers/recurrent.py#L2415

Code for LSTMCell:

def call(self, inputs, states, training=None):
    h_tm1 = states[0]  # previous memory state
    c_tm1 = states[1]  # previous carry state

    dp_mask = self.get_dropout_mask_for_cell(inputs, training, count=4)
    rec_dp_mask = self.get_recurrent_dropout_mask_for_cell(h_tm1, training, count=4)

    if self.implementation == 1:
        # Implementation 1: per-gate dropout masks and a separate matmul per gate.
        if 0 < self.dropout < 1.:
            inputs_i = inputs * dp_mask[0]
            inputs_f = inputs * dp_mask[1]
            inputs_c = inputs * dp_mask[2]
            inputs_o = inputs * dp_mask[3]
        else:
            inputs_i = inputs
            inputs_f = inputs
            inputs_c = inputs
            inputs_o = inputs
        k_i, k_f, k_c, k_o = tf.split(self.kernel, num_or_size_splits=4, axis=1)
        x_i = backend.dot(inputs_i, k_i)
        x_f = backend.dot(inputs_f, k_f)
        x_c = backend.dot(inputs_c, k_c)
        x_o = backend.dot(inputs_o, k_o)
        if self.use_bias:
            b_i, b_f, b_c, b_o = tf.split(self.bias, num_or_size_splits=4, axis=0)
            x_i = backend.bias_add(x_i, b_i)
            x_f = backend.bias_add(x_f, b_f)
            x_c = backend.bias_add(x_c, b_c)
            x_o = backend.bias_add(x_o, b_o)

        if 0 < self.recurrent_dropout < 1.:
            h_tm1_i = h_tm1 * rec_dp_mask[0]
            h_tm1_f = h_tm1 * rec_dp_mask[1]
            h_tm1_c = h_tm1 * rec_dp_mask[2]
            h_tm1_o = h_tm1 * rec_dp_mask[3]
        else:
            h_tm1_i = h_tm1
            h_tm1_f = h_tm1
            h_tm1_c = h_tm1
            h_tm1_o = h_tm1
        x = (x_i, x_f, x_c, x_o)
        h_tm1 = (h_tm1_i, h_tm1_f, h_tm1_c, h_tm1_o)
        c, o = self._compute_carry_and_output(x, h_tm1, c_tm1)
    else:
        # Implementation 2: a single fused matmul for all four gates, then split.
        if 0. < self.dropout < 1.:
            inputs = inputs * dp_mask[0]
        z = backend.dot(inputs, self.kernel)
        z += backend.dot(h_tm1, self.recurrent_kernel)
        if self.use_bias:
            z = backend.bias_add(z, self.bias)

        z = tf.split(z, num_or_size_splits=4, axis=1)
        c, o = self._compute_carry_and_output_fused(z, c_tm1)

    h = o * self.activation(c)  # new hidden state: output gate times activated carry
    return h, [h, c]
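
The helpers _compute_carry_and_output and _compute_carry_and_output_fused are not shown above. As a rough sketch (simplified from the actual Keras source, and assuming the default recurrent_activation of sigmoid and activation of tanh), the fused variant computes the standard LSTM gate equations:

import tensorflow as tf

def compute_carry_and_output_fused(z, c_tm1):
    # z holds the four pre-activation slices produced by tf.split above.
    z_i, z_f, z_c, z_o = z
    i = tf.sigmoid(z_i)                # input gate
    f = tf.sigmoid(z_f)                # forget gate
    c = f * c_tm1 + i * tf.tanh(z_c)   # new carry (cell) state
    o = tf.sigmoid(z_o)                # output gate
    return c, o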

Code for MinimalRNNCell:

def call(self, inputs, states):
    prev_output = states[0]  # the only state: the previous output
    h = backend.dot(inputs, self.kernel)
    output = h + backend.dot(prev_output, self.recurrent_kernel)
    return output, [output]
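
The snippet above shows only call; the full MinimalRNNCell example in the Keras documentation also defines __init__ and build. A custom cell like this is not used on its own: it is wrapped by keras.layers.RNN, which drives it over the timesteps. A short usage sketch, assuming the full MinimalRNNCell class is defined:

import tensorflow as tf
from tensorflow import keras

cell = MinimalRNNCell(32)          # the cell class quoted above
x = keras.Input(shape=(None, 5))   # (timesteps, features)
layer = keras.layers.RNN(cell)     # the RNN wrapper loops the cell over time
y = layer(x)                       # output shape: (batch, 32)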


Solution 1:

MinimalRNNCell implements a plain (vanilla) RNN cell, which has no long-term dependency: its only state is the hidden state, which is why its call returns a single state tensor. LSTMCell implements the basic LSTM unit, which maintains both a long-term dependency (the cell state) and a short-term one (the hidden state), which is why its call returns [h, c].
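
To make that concrete, below is a hedged sketch of a custom cell written in the same minimal style as MinimalRNNCell but with the LSTM's two states. The class name MyLSTMCell and the weight names are invented for illustration, and it is deliberately simplified (no dropout, no implementation switch, fixed activations) compared with the real LSTMCell:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import backend

class MyLSTMCell(keras.layers.Layer):
    # Illustrative sketch, not Keras source: a MinimalRNNCell-style custom
    # cell that carries both a hidden state h and a carry (cell) state c.
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Two state tensors: hidden state h and carry state c.
        self.state_size = [units, units]

    def build(self, input_shape):
        # One fused kernel for all four gates, as in LSTMCell's implementation 2.
        self.kernel = self.add_weight(
            shape=(input_shape[-1], self.units * 4), name="kernel")
        self.recurrent_kernel = self.add_weight(
            shape=(self.units, self.units * 4), name="recurrent_kernel")
        self.bias = self.add_weight(
            shape=(self.units * 4,), initializer="zeros", name="bias")
        self.built = True

    def call(self, inputs, states):
        h_tm1, c_tm1 = states
        z = backend.dot(inputs, self.kernel)
        z += backend.dot(h_tm1, self.recurrent_kernel)
        z = backend.bias_add(z, self.bias)
        z_i, z_f, z_c, z_o = tf.split(z, num_or_size_splits=4, axis=1)
        i = tf.sigmoid(z_i)                # input gate
        f = tf.sigmoid(z_f)                # forget gate
        c = f * c_tm1 + i * tf.tanh(z_c)   # new carry state
        o = tf.sigmoid(z_o)                # output gate
        h = o * tf.tanh(c)                 # new hidden state
        # Like LSTMCell: emit h as the output and keep both h and c as state.
        return h, [h, c]

Wrapped in keras.layers.RNN, this behaves like a stripped-down LSTM layer; the real LSTMCell adds configurable activations, initializers, dropout, and the two implementation paths shown above.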

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution 1: Mingming Qiu