diff --git a/doc/DeepSpeech.rst b/doc/DeepSpeech.rst
index c907d950..b4fa7ebd 100644
--- a/doc/DeepSpeech.rst
+++ b/doc/DeepSpeech.rst
@@ -1,9 +1,16 @@
 Introduction
 ============
 
-In this project we will reproduce the results of
+The aim of this project is to create a simple, open, and ubiquitous speech
+recognition engine. Simple, in that the engine should not require server-class
+hardware to execute. Open, in that the code and models are released under the
+Mozilla Public License. Ubiquitous, in that the engine should run on many
+platforms and have bindings to many different languages.
+
+The architecture of the engine was originally motivated by that presented in
 `Deep Speech: Scaling up end-to-end speech recognition <http://arxiv.org/abs/1412.5567>`_.
-The core of the system is a bidirectional recurrent neural network (BRNN)
+However, the engine currently differs in many respects from the engine it was
+originally motivated by. The core of the engine is a recurrent neural network (RNN)
 trained to ingest speech spectrograms and generate English text transcriptions.
 
 Let a single utterance :math:`x` and label :math:`y` be sampled from a training set
@@ -14,19 +21,19 @@ Let a single utterance :math:`x` and label :math:`y` be sampled from a training
 Each utterance, :math:`x^{(i)}` is a time-series of length :math:`T^{(i)}`
 where every time-slice is a vector of audio features,
 :math:`x^{(i)}_t` where :math:`t=1,\ldots,T^{(i)}`.
-We use MFCC as our features; so :math:`x^{(i)}_{t,p}` denotes the :math:`p`-th MFCC feature
-in the audio frame at time :math:`t`. The goal of our BRNN is to convert an input
+We use MFCC coefficients as our features; so :math:`x^{(i)}_{t,p}` denotes the :math:`p`-th MFCC feature
+in the audio frame at time :math:`t`. The goal of our RNN is to convert an input
 sequence :math:`x` into a sequence of character probabilities for the transcription
 :math:`y`, with :math:`\hat{y}_t =\mathbb{P}(c_t \mid x)`,
-where :math:`c_t \in \{a,b,c, . . . , z, space, apostrophe, blank\}`.
+where for English :math:`c_t \in \{a,b,c, . . . , z, space, apostrophe, blank\}`.
 (The significance of :math:`blank` will be explained below.)
 
-Our BRNN model is composed of :math:`5` layers of hidden units.
+Our RNN model is composed of :math:`5` layers of hidden units.
 For an input :math:`x`, the hidden units at layer :math:`l` are denoted :math:`h^{(l)}` with the
 convention that :math:`h^{(0)}` is the input. The first three layers are not recurrent.
 For the first layer, at each time :math:`t`, the output depends on the MFCC frame
 :math:`x_t` along with a context of :math:`C` frames on each side.
-(We typically use :math:`C \in \{5, 7, 9\}` for our experiments.)
+(We use :math:`C = 9` for our experiments.)
 The remaining non-recurrent layers operate on independent data for each time step.
 Thus, for each time :math:`t`, the first :math:`3` layers are computed by:
 
@@ -35,28 +42,24 @@ Thus, for each time :math:`t`, the first :math:`3` layers are computed by:
 
 where :math:`g(z) = \min\{\max\{0, z\}, 20\}` is a clipped rectified-linear (ReLu)
 activation function and :math:`W^{(l)}`, :math:`b^{(l)}` are the weight matrix and bias
-parameters for layer :math:`l`. The fourth layer is a bidirectional recurrent
-layer `[1] <http://www.di.ufpe.br/~fnj/RNA/bibliografia/BRNN.pdf>`_.
-This layer includes two sets of hidden units: a set with forward recurrence,
-:math:`h^{(f)}`, and a set with backward recurrence :math:`h^{(b)}`:
+parameters for layer :math:`l`. The fourth layer is a recurrent
+layer `[1] <http://www.di.ufpe.br/~fnj/RNA/bibliografia/BRNN.pdf>`_.
+This layer includes a set of hidden units with forward recurrence,
+:math:`h^{(f)}`:
 
 .. math::
     h^{(f)}_t = g(W^{(4)} h^{(3)}_t + W^{(f)}_r h^{(f)}_{t-1} + b^{(4)})
 
-    h^{(b)}_t = g(W^{(4)} h^{(3)}_t + W^{(b)}_r h^{(b)}_{t+1} + b^{(4)})
-
 Note that :math:`h^{(f)}` must be computed sequentially from :math:`t = 1` to :math:`t = T^{(i)}`
-for the :math:`i`-th utterance, while the units :math:`h^{(b)}` must be computed
-sequentially in reverse from :math:`t = T^{(i)}` to :math:`t = 1`.
+for the :math:`i`-th utterance.
 
-The fifth (non-recurrent) layer takes both the forward and backward units as inputs
+The fifth (non-recurrent) layer takes the forward units as inputs
 
 .. math::
-    h^{(5)} = g(W^{(5)} h^{(4)} + b^{(5)})
+    h^{(5)} = g(W^{(5)} h^{(f)} + b^{(5)}).
 
-where :math:`h^{(4)} = h^{(f)} + h^{(b)}`. The output layer are standard logits that
-correspond to the predicted character probabilities for each time slice :math:`t` and
-character :math:`k` in the alphabet:
+The output layer is standard logits that correspond to the predicted character probabilities
+for each time slice :math:`t` and character :math:`k` in the alphabet:
 
 .. math::
     h^{(6)}_{t,k} = \hat{y}_{t,k} = (W^{(6)} h^{(5)}_t)_k + b^{(6)}_k
@@ -66,14 +69,15 @@ element of the matrix product.
 
 Once we have computed a prediction for :math:`\hat{y}_{t,k}`, we compute the CTC loss
 `[2] <http://www.cs.toronto.edu/~graves/preprint.pdf>`_ :math:`\cal{L}(\hat{y}, y)`
-to measure the error in prediction. During training, we can evaluate the gradient
+to measure the error in prediction. (The CTC loss requires the :math:`blank` above
+to indicate transitions between characters.) During training, we can evaluate the gradient
 :math:`\nabla \cal{L}(\hat{y}, y)` with respect to the network outputs given the
 ground-truth character sequence :math:`y`. From this point, computing the gradient
 with respect to all of the model parameters may be done via back-propagation
 through the rest of the network. We use the Adam method for training
 `[3] <http://arxiv.org/abs/1412.6980>`_.
 
-The complete BRNN model is illustrated in the figure below.
+The complete RNN model is illustrated in the figure below.
 
-.. image:: ../images/rnn_fig-624x548.png
+.. image:: ../images/rnn_fig-624x598.png
     :alt: DeepSpeech BRNN
diff --git a/images/rnn_fig-624x548.png b/images/rnn_fig-624x548.png
deleted file mode 100644
index 0f288bad..00000000
Binary files a/images/rnn_fig-624x548.png and /dev/null differ
diff --git a/images/rnn_fig-624x598.png b/images/rnn_fig-624x598.png
new file mode 100644
index 00000000..ecc79322
Binary files /dev/null and b/images/rnn_fig-624x598.png differ
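
The revised section fully specifies the network's forward pass, so it can be
traced end to end. What follows is a minimal NumPy sketch of that computation,
assuming :math:`C = 9` context frames; it is an illustration of the equations
above, not the project's actual TensorFlow implementation, and every parameter
name in it (``W1`` through ``W6``, ``Wr``) is hypothetical.

.. code-block:: python

    import numpy as np

    def clipped_relu(z):
        # g(z) = min(max(0, z), 20), the clipped ReLU used in every hidden layer.
        return np.minimum(np.maximum(0.0, z), 20.0)

    def forward(x, params, context=9):
        """Run one utterance of MFCC frames x, shape (T, F), through the five
        layers described above; returns (T, K) logits over the alphabet."""
        T, _ = x.shape
        # Layer 1 sees each frame plus `context` frames on each side, i.e. a
        # window of 2C + 1 frames concatenated into a single input vector.
        padded = np.pad(x, ((context, context), (0, 0)))
        window = 2 * context + 1
        h0 = np.stack([padded[t:t + window].ravel() for t in range(T)])

        h1 = clipped_relu(h0 @ params["W1"] + params["b1"])
        h2 = clipped_relu(h1 @ params["W2"] + params["b2"])
        h3 = clipped_relu(h2 @ params["W3"] + params["b3"])

        # Layer 4: forward-only recurrence, necessarily computed in order
        # from t = 1 to t = T for each utterance.
        hf = np.zeros((T, params["b4"].shape[0]))
        prev = np.zeros_like(params["b4"])
        for t in range(T):
            prev = clipped_relu(h3[t] @ params["W4"]
                                + prev @ params["Wr"] + params["b4"])
            hf[t] = prev

        h5 = clipped_relu(hf @ params["W5"] + params["b5"])
        # Output layer: raw per-timestep logits; the softmax is folded into
        # the CTC loss during training.
        return h5 @ params["W6"] + params["b6"]

The document's equations multiply column vectors on the left
(:math:`W^{(l)} h^{(l-1)}_t`); the sketch uses the equivalent row-vector
convention ``h @ W`` that is natural in NumPy.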
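The last hunk's description of training also maps directly onto library code.
As a hedged sketch only (DeepSpeech's real training loop is more involved),
TensorFlow's built-in CTC loss can be evaluated, and differentiated with
respect to the network outputs, like this; the batch size, utterance length,
and 29-symbol alphabet indexing (a through z, space, apostrophe, with blank
last) are assumptions made for the example.

.. code-block:: python

    import tensorflow as tf

    batch, T, num_classes = 8, 200, 29  # 28 characters + CTC blank (assumed sizes)
    logits = tf.random.normal([T, batch, num_classes])  # stand-in for h^{(6)}, time-major

    # Dense integer labels: "hello" -> h=7, e=4, l=11, l=11, o=14 with a=0 ... z=25.
    labels = tf.constant([[7, 4, 11, 11, 14]] * batch)

    with tf.GradientTape() as tape:
        tape.watch(logits)
        # CTC sums over every alignment of the label with the T frames, using
        # the blank symbol to separate repeated characters.
        loss = tf.reduce_mean(tf.nn.ctc_loss(
            labels=labels,
            logits=logits,
            label_length=tf.fill([batch], 5),
            logit_length=tf.fill([batch], T),
            logits_time_major=True,
            blank_index=num_classes - 1))

    # The gradient with respect to the network outputs; back-propagation through
    # the rest of the network and an Adam update (reference [3]) proceed from here.
    grad_wrt_outputs = tape.gradient(loss, logits)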