Gated Recurrent Units (GRU)
Learn about the GRU, a computationally efficient variant of the LSTM.
The GRU (Gated Recurrent Unit) is a newer, simplified variant of the LSTM, introduced by Kyunghyun Cho et al. in 2014. Like the LSTM, it mitigates the vanishing gradient problem, but it uses fewer parameters, making it faster to train.
Simplified Architecture
A GRU combines the LSTM's forget and input gates into a single Update Gate. It also merges the cell state and hidden state into one hidden state, leaving only two gates in total: Update and Reset.
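The two gates above can be sketched as a single forward step in NumPy. This is a minimal illustration, not a production implementation: biases are omitted for brevity, and the weight names (Wz, Uz, etc.) are illustrative, not from any particular library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step (biases omitted for brevity)."""
    # Update gate: how much of the candidate replaces the old state.
    z = sigmoid(x @ Wz + h_prev @ Uz)
    # Reset gate: how much of the past state feeds the candidate.
    r = sigmoid(x @ Wr + h_prev @ Ur)
    # Candidate hidden state, computed from the reset-scaled past.
    h_tilde = np.tanh(x @ Wh + (r * h_prev) @ Uh)
    # New state: interpolation between previous state and candidate.
    return (1 - z) * h_prev + z * h_tilde
```

Note that the update gate plays both the "forget" and "input" roles: `(1 - z)` decides what to keep, `z` decides what to write.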
Faster Training
Fewer tensor operations mean GRUs typically train faster and use less memory than LSTMs, often achieving comparable performance, particularly on smaller datasets.
Level 1 — GRU Implementation in Keras
Using a GRU layer
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense, Embedding

model = Sequential([
    Embedding(input_dim=10000, output_dim=128),
    # Swap out 'LSTM' for 'GRU'. GRUs are great right out of the box!
    GRU(128, return_sequences=False),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])