Animated RNN, LSTM and GRU

Interview LSTM RNN GRU

Deep Learning

Published: 2021-03-22

RNN is

Fig. 0: Legend for animations

Vanilla RNN

Fig. 1: Animated vanilla RNN cell

$t$: time step
$X$: input
$h$: hidden state
length of $X$: size/dimension of the input
length of $h$: number of hidden units. Libraries use different names for this quantity, but they all mean the same thing:
- Keras: state_size, units
- PyTorch: hidden_size
- TensorFlow: num_units
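
To make the notation above concrete, here is a minimal NumPy sketch of one vanilla RNN step, assuming the common update rule h_t = tanh(W_x x_t + W_h h_{t-1} + b). The names rnn_step, W_x, W_h and b are illustrative and do not come from the figures; the weights are random, so only the shapes are meaningful.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """One vanilla RNN step (assumed form): h_t = tanh(W_x x_t + W_h h_prev + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

input_size = 3   # length of X
hidden_size = 2  # length of h, i.e. the number of hidden units

rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_size, input_size))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)              # initial hidden state
X = rng.normal(size=(5, input_size))   # a sequence of 5 time steps
for x_t in X:                          # t = 0, 1, ..., 4
    h = rnn_step(x_t, h, W_x, W_h, b)

print(h.shape)  # (2,)
```

The hidden state keeps the same length at every time step, which is exactly the quantity that Keras, PyTorch and TensorFlow expose under the different names listed above.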

The two figures below show the internal structure of a single LSTM cell.

Animated LSTM cell

The repeating module in an LSTM contains four interacting layers.

The symbols in the figure above have the following meanings:

Symbol meanings

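A Neural Network Layer contains parameters that need to be learned.
A Pointwise Operation is a plain vector operation, such as vector addition. Look closely at the figure above: one tanh is a Neural Network Layer, while the other tanh is a Pointwise Operation. The difference is clear from the formulas in the Input Gate Layer and Output Gate Layer sections.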
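
As a quick reference for the two tanh uses, here they are side by side in the standard LSTM formulation, which is assumed here; the symbols $W_C$, $b_C$, $\tilde{C}_t$, $o_t$ and $C_t$ follow the usual convention and may differ from the notation in the figures. The first tanh wraps a learned affine map, so it is a Neural Network Layer; the second is applied element by element to the cell state and has no parameters.

$$
\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right)
\qquad \text{(Neural Network Layer)}
$$

$$
h_t = o_t \odot \tanh\left(C_t\right)
\qquad \text{(Pointwise Operation)}
$$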

An LSTM uses three gates in total to control the cell state.
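
The three gates can be sketched in NumPy as follows, assuming the standard LSTM equations with a forget gate, an input gate and an output gate. The function lstm_step and the dictionaries W and b are hypothetical names chosen for this illustration, and the weights are random, so only the structure and shapes matter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step (assumed standard form).

    W[k] has shape (hidden, hidden + input) and b[k] has shape (hidden,)
    for k in {"f", "i", "o", "c"}: the four layers with learned parameters.
    """
    z = np.concatenate([h_prev, x_t])        # [h_{t-1}, x_t]
    f = sigmoid(W["f"] @ z + b["f"])         # forget gate
    i = sigmoid(W["i"] @ z + b["i"])         # input gate
    o = sigmoid(W["o"] @ z + b["o"])         # output gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])   # candidate values: tanh as a layer
    c = f * c_prev + i * c_tilde             # new cell state (pointwise ops)
    h = o * np.tanh(c)                       # new hidden state: pointwise tanh
    return h, c

input_size, hidden_size = 3, 2
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(hidden_size, hidden_size + input_size)) for k in "fioc"}
b = {k: np.zeros(hidden_size) for k in "fioc"}

h = np.zeros(hidden_size)
c = np.zeros(hidden_size)
for x_t in rng.normal(size=(4, input_size)):  # 4 time steps
    h, c = lstm_step(x_t, h, c, W, b)

print(h.shape, c.shape)  # (2,) (2,)
```

Each gate is a sigmoid layer whose output lies between 0 and 1, so multiplying by it scales how much of the old cell state, the candidate values, or the squashed cell state passes through.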